Gene involved in V(D)J recombination and/or DNA repair

ABSTRACT

The invention relates to a new gene and protein involved in V(D)J recombination and/or DNA repair. The invention also relates to methods of diagnosis and therapy using this gene and protein. The invention also relates to transgenic animals over- or under-expressing said gene.

FIELD OF THE INVENTION

The present invention relates to a novel DNA sequence that is involved in V(D)J recombination in lymphocytes, and the mutation of which causes Severe Combined Immunodeficiencies (SCID). The invention also relates to methods of diagnosis, methods of therapy, and methods of screening of new compounds using this sequence, as well as to transgenic animals.

BACKGROUND OF THE INVENTION

B and T lymphocytes recognize foreign antigen through specialized receptors: the immunoglobulins and the T cell receptor (TCR) respectively. The highly polymorphic antigen-recognition regions of these receptors are composed of variable (V), diversity (D), and joining (J) gene segments which undergo somatic rearrangement prior to their expression by a mechanism known as V(D)J recombination (Tonegawa, 1983). Each V, D, and J segment is flanked by Recombination Signal Sequences (RSSs) composed of conserved heptamers and nonamers separated by random sequences of either 12 or 23 nucleotides. RSSs serve as recognition sequences for the V(D)J Recombinase.

V(D)J recombination can be roughly divided into three steps. The RAG1 and RAG2 proteins initiate the rearrangement process through the recognition of the RSS and the introduction of a DNA double strand break (dsb) at the border of the heptamer (Schatz et al., 1989; Oettinger, 1990). RAG1 and RAG2 are the sole two factors required to catalyze DNA cleavage in cell-free systems (McBlane et al., 1995; Van Gent et al., 1995; Eastman et al., 1996) in a reaction reminiscent of retroviral integration and transposition (van Gent et al., 1996; Roth and Craig, 1998). Three acidic residues, DDE, were shown to compose the active site carried by RAG1 (Kim et al., 1999; Landree et al., 1999; Fugmann et al., 2000). The restricted expression of both RAG1 and RAG2 genes to immature B and T lymphocytes confines V(D)J recombination to the lymphoid lineage. At the end of this phase, which causes DNA damage, the chromosomal DNA is left with two hairpin-sealed coding ends (CE), while the RSSs and the intervening DNA sequences are excised from the chromosome as blunt, phosphorylated signal ends (SE) (Roth et al., 1992; Schlissel et al., 1993; Zhu and Roth, 1995). The subsequent step consists of the recognition and signaling of the DNA damage to the DNA repair machinery. From this point on, ubiquitous enzymatic activities are involved.

The description of the murine scid situation, characterized by a lack of circulating mature B and T lymphocytes (Bosma et al., 1983), as a general DNA repair defect accompanied by an increased sensitivity to ionizing radiation or other agents causing DNA dsb provided the link between V(D)J recombination and DNA dsb repair (Fulop, 1990; Biedermann, 1991; Hendrickson, 1991). This was further confirmed by the analysis of Chinese hamster ovary (CHO) cell lines, initially selected on the basis of their defect in DNA repair, which turned out to have impaired V(D)J recombination in vitro (Taccioli et al., 1993). This led to the description of the Ku70/Ku80/DNA-PKcs complex as a DNA damage sensor (reviewed in Jackson and Jeggo, 1995). Briefly, DNA-PKcs is a DNA-dependent protein kinase that belongs to the Phosphoinositol (PI) kinase family and is recruited to the site of the DNA lesion through its interaction with the regulatory complex Ku70/80, which binds to DNA ends (Gottlieb and Jackson, 1993). Cells from scid mice lack DNA-PK activity owing to a mutation in the DNA-PKcs encoding gene (Blunt et al., 1996; Danska et al., 1996). This severely compromises the V(D)J recombination process, ultimately leading to an arrest in both B and T cell development.

More recently, two other proteins, NBS1 and γ-H2AX, have been identified at the site of chromosomal rearrangement in the TCR-α locus in thymocytes (Chen et al., 2000). NBS1, which is mutated in the Nijmegen breakage syndrome, participates in the formation of the RAD50/MRE11/NBS1 complex involved in DNA repair (Carney et al., 1998; Varon et al., 1998). γ-H2AX represents the phosphorylated form of histone H2AX produced in response to external damage and is considered an important sensor of DNA damage (Rogakou et al., 1998; Rogakou et al., 1999; Paull et al., 2000). The biological implication of this observation is not yet fully understood, but it indicates that the RAD50/MRE11/NBS1 complex may cooperate with the DNA-PK complex in sensing and signaling the RAG1/2-mediated DNA dsb to the cellular DNA repair machinery. In the final phase of the V(D)J rearrangement, the DNA-repair machinery per se ensures the re-ligation of the two chromosomal broken ends. This last step resembles the well-known DNA non-homologous end joining (NHEJ) pathway in the yeast Saccharomyces cerevisiae (reviewed in Haber, 2000) and involves the XRCC4 (Li et al., 1995) and the DNA-Ligase IV (Robins and Lindahl, 1996) factors. The crystal structure recently obtained for XRCC4 demonstrates the dumb-bell like conformation of this protein and provides a structural basis for its binding to DNA as well as its association with DNA-Ligase IV (Junop et al., 2000). All the animal models carrying a defective gene of any one of the known V(D)J recombination factors, either natural (murine and equine scid) or engineered through homologous recombination, have a profound defect in the lymphoid developmental program owing to an arrest of the B and T cell maturation at early stages (Mombaerts et al., 1992; Shinkai et al., 1992; Nussenzweig et al., 1996; Zhu et al., 1996; Jhappan et al., 1997; Shin et al., 1997; Barnes et al., 1998; Frank et al., 1998; Gao et al., 1998; Gao et al., 1998; Taccioli et al., 1998). In the cases of DNA-Ligase IV and XRCC4 this phenotype is also accompanied by an early embryonic lethality caused by massive apoptotic death of postmitotic neurons (Barnes et al., 1998; Frank et al., 1998; Gao et al., 1998).

In humans, several immune deficiency conditions are characterized by a faulty T and/or B cell developmental program (Fischer et al., 1997). In about 20% of the cases, the severe combined immunodeficiency (SCID) phenotype is caused by a complete absence of both circulating B and T lymphocytes, associated with a defect in the V(D)J recombination process, while Natural Killer (NK) cells are present. Mutations in either the RAG1 or RAG2 gene account for a subset of patients with this condition (Schwarz et al., 1996; Corneo et al., 2000; Villa et al., 2001). In some patients (RS-SCID), the T-B- SCID defect is not caused by RAG1 or RAG2 mutations and is accompanied by an increased sensitivity to ionizing radiation of both bone marrow cells (CFU-GMs) and primary skin fibroblasts (Cavazzana-Calvo et al., 1993), as well as a defect in V(D)J recombination in fibroblasts (Nicolas et al., 1998).

Although this condition suggests that RS-SCID could involve a general DNA-repair defect reminiscent of the murine scid situation, DNA-PK activity was found to be normal in these patients and the implication of the DNA-PKcs gene has been unequivocally ruled out by genetic means in several consanguineous families (Nicolas et al., 1996). A role for all the other known genes involved in V(D)J recombination/DNA repair was equally excluded as being responsible for the RS-SCID condition (Nicolas et al., 1996). The gene defective in RS-SCID therefore encodes an as yet undescribed factor. The inventors recently assigned the disease-related locus to the short arm of human chromosome 10, in a 6.5 cM region delimited by two polymorphic markers, D10S1664 and D10S674 (Moshous et al., 2000), a region shown to be linked to a similar SCID condition described in Athabascan-speaking American Indians (A-SCID) (Hu et al., 1988; Li et al., 1998).

SUMMARY OF THE INVENTION

The present invention relates to the identification and cloning of the Artemis gene, localized in this region of Chromosome 10. Artemis codes for a novel V(D)J recombination and/or DNA repair factor that belongs to the metallo β-lactamase superfamily and whose mutations give rise to the human RS-SCID condition.

In particular, the present invention relates to an isolated nucleic acid molecule selected from the group consisting of:

-   a) SEQ ID No 1, nucleotides 39-2114 of SEQ ID No 1, or nucleotides 60-2114 of SEQ ID No 1,
-   b) an isolated and purified nucleic acid comprising the nucleic acid of a),
-   c) an isolated nucleic acid that specifically hybridizes under (highly) stringent conditions to the complement of the nucleic acid of a) (under high stringency conditions of 0.2×SSC and 0.1% SDS at 55-65° C.), preferably wherein said nucleic acid encodes a protein that is involved in the V(D)J recombination and/or DNA repair,
-   d) an isolated nucleic acid having at least 80% homology with the nucleic acid of a), preferably over the full length of SEQ ID No 1, and preferably wherein said nucleic acid encodes a protein that is involved in the V(D)J recombination and/or DNA repair,
-   e) a fragment of the nucleic acid of a) comprising at least 15 nucleotides, with the proviso that said fragment is not entirely comprised between nucleotides 158-609, 607-660, or 29-537 of SEQ ID No 1.

The present invention also relates to the polypeptides coded by the nucleic acid of the invention and to the different uses that can be made of the objects of the invention.

DESCRIPTION OF THE FIGURES

FIG. 1 represents a schematic view of the genomic organization of the Artemis gene, with the exons represented as rectangles, and the different mutations identified in RS-SCID patients. The 12-mer nucleotide sequence is shown in SEQ ID NO: 34.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

In a first aspect, the present invention relates to an isolated nucleic acid molecule selected from the group consisting of:

-   a) SEQ ID No 1, nucleotides 39-2114 of SEQ ID No 1, or nucleotides 60-2114 of SEQ ID No 1,
-   b) an isolated nucleic acid comprising the nucleic acid of a),
-   c) an isolated nucleic acid that specifically hybridizes under stringent conditions to the complement of the nucleic acid of a),
-   d) an isolated nucleic acid having at least 80% homology with the nucleic acid of a),
-   e) a fragment of the nucleic acid of a) comprising at least 15 nucleotides, with the proviso that said fragment is not entirely comprised between nucleotides 158-609, 607-660, or 29-537 of SEQ ID No 1.
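By way of illustration only, the length and proviso conditions of item e) can be sketched as a coordinate check. This is a minimal sketch, not part of the claimed subject matter: the function name, the constant names, and the assumed total length of 2354 nucleotides (the highest coordinate quoted in this description) are assumptions, and the proviso is read here as excluding fragments that lie entirely inside any one of the three quoted ranges.

```python
# Ranges quoted in item e), 1-based and inclusive (SEQ ID No 1 coordinates).
EXCLUDED_RANGES = [(158, 609), (607, 660), (29, 537)]
MIN_FRAGMENT_LENGTH = 15
ASSUMED_SEQ_LENGTH = 2354  # assumption: highest coordinate quoted in the text


def fragment_meets_item_e(start, end):
    """Return True if the fragment start..end satisfies item e) as read here."""
    if not (1 <= start <= end <= ASSUMED_SEQ_LENGTH):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= 2354")
    if end - start + 1 < MIN_FRAGMENT_LENGTH:
        return False  # shorter than 15 nucleotides
    # Reject fragments lying entirely inside any single excluded range.
    return not any(lo <= start and end <= hi for lo, hi in EXCLUDED_RANGES)


if __name__ == "__main__":
    for frag in [(39, 2114), (200, 400), (610, 650), (600, 660), (100, 110)]:
        print(frag, fragment_meets_item_e(*frag))
```

Under this reading, a fragment such as 600-660 passes because it straddles the boundary between two quoted ranges without being contained in either one.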

It is also envisioned that the invention encompasses the genomic nucleic acid sequence that leads to SEQ ID No 1 after transcription. The genomic nucleic acid sequence can easily be obtained from SEQ ID No 1 by the person skilled in the art by screening a library of genomic DNA, using a probe derived from SEQ ID No 1. It may also be obtained starting from the sequence having the GenBank accession number AL360083, which may be available at the following address: http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=Nucleotide&list_uids=112584428&dopt=GenBank.

The invention also encompasses an isolated nucleic acid molecule that is the complement of the isolated nucleic acid molecule of the invention, as described above.

In a preferred embodiment, the nucleic acid of c) specifically hybridizes under highly stringent conditions to the complement of the nucleic acid of a) (for example under high stringency conditions of 0.2×SSC and 0.1% SDS at 55-65° C., or under conditions as described below), and in a preferred embodiment, encodes a protein that has a biological activity of V(D)J recombination and/or DNA repair.

In other embodiments, the nucleic acid of d) harbors 85, 90, 95, 98 or 99% homology with the nucleic acid of a). The homology is preferably calculated over the full length of SEQ ID No 1 or nucleotides 39-2114 of SEQ ID No 1. In a preferred embodiment, said nucleic acid encodes a protein that has a biological activity of V(D)J recombination and/or DNA repair.

By nucleic acid, nucleic sequence, nucleic acid sequence, polynucleotide, oligonucleotide, polynucleotide sequence, nucleotidic sequence, all terms that will be used interchangeably in the present application, one designates a specific string of nucleotides, modified or not, that defines a fragment or region of a nucleic acid, comprising natural or non-natural nucleotides. It may be a double strand DNA, a single strand DNA, or transcription products of said DNAs. The sequences according to the invention also comprise Peptide Nucleic Acid, or analogs, or sequences with modified nucleotides (phosphorothioates, methylphosphonates . . . ).

By isolated, it is meant that the nucleic acid of the invention is not in its natural chromosomal environment. The sequences according to the invention have been isolated and/or purified, meaning that they have been directly or indirectly obtained (for example by copy through amplification), their natural environment being at least partially modified. The nucleic acids that have been obtained through chemical synthesis are also part of the present invention.

The stringent hybridization conditions may be defined as described in Sambrook et al. ((1989) Molecular cloning: a laboratory manual. 2nd Ed. Cold Spring Harbor Lab., Cold Spring Harbor, N.Y.), with the following conditions: 5× or 6×SSC, 50-65° C. Highly stringent conditions that can also be used for hybridization are defined with the following conditions: 6×SSC, 60-65° C.

DNA-DNA or DNA-RNA hybridization may be performed in two steps: (1) prehybridization at 42° C. for 3 h in phosphate buffer (20 mM, pH 7.5) containing 5 or 6×SSC (1×SSC corresponding to a solution of 0.15 M NaCl+0.015 M sodium citrate), 50% formamide, 7% sodium dodecyl sulfate (SDS), 10× Denhardt's, 5% dextran sulfate and 1% salmon sperm DNA; (2) hybridization for up to 20 h at a temperature of 50-65° C., more preferably 60-65° C., followed by different washes (about 20 minutes in 2×SSC+2% SDS, then 0.1×SSC+0.1% SDS). The last wash is performed in 0.2×SSC+0.1% SDS for about 30 minutes at about 50-65° C., and/or in 0.1×SSC+0.1% SDS at the same temperature. These high stringency hybridization conditions may be adapted by a person skilled in the art. Indeed, the person skilled in the art is able to determine the best stringency conditions by varying the concentrations of SSC and SDS and the temperature of hybridization and washings.
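Because the buffers above are all expressed as multiples of 1×SSC (0.15 M NaCl + 0.015 M sodium citrate), the corresponding molar concentrations follow by simple scaling. The short sketch below illustrates that arithmetic; the function and constant names are hypothetical and are used only for illustration.

```python
# Composition of 1xSSC as stated in the description.
NACL_M_PER_1X = 0.15
SODIUM_CITRATE_M_PER_1X = 0.015


def ssc_molarity(fold):
    """Return the NaCl and sodium citrate molarities of an n-fold SSC solution."""
    if fold <= 0:
        raise ValueError("fold concentration must be positive")
    return {
        "NaCl_M": fold * NACL_M_PER_1X,
        "sodium_citrate_M": fold * SODIUM_CITRATE_M_PER_1X,
    }


if __name__ == "__main__":
    # Concentrations quoted for hybridization (6x) and washes (2x, 0.2x, 0.1x).
    for fold in (6, 2, 0.2, 0.1):
        conc = ssc_molarity(fold)
        print(f"{fold}x SSC: NaCl {conc['NaCl_M']:.4f} M, "
              f"sodium citrate {conc['sodium_citrate_M']:.4f} M")
```

Lowering the SSC multiple lowers the salt concentration, which is one of the parameters the skilled person may vary to adjust stringency.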

The term “conditions of high stringency” also refers to hybridization and washing under conditions that permit binding of a nucleic acid molecule used for screening, such as an oligonucleotide probe or cDNA molecule probe, to highly homologous sequences. An exemplary high stringency wash solution is 0.2×SSC and 0.1% SDS used at a temperature of between 50-65° C.

Where oligonucleotide probes are used to screen cDNA or genomic libraries, one of the following two high stringency solutions may be used. The first of these is 6×SSC with 0.05% sodium pyrophosphate at a temperature of 35° C.-62° C., depending on the length of the oligonucleotide probe. For example, 14 base pair probes are washed at 35° C.-40° C., 17 base pair probes are washed at 45° C.-50° C., 20 base pair probes are washed at 52° C.-57° C., and 23 base pair probes are washed at 57-63° C. The temperature can be increased 2-3° C. where the background non-specific binding appears high. A second high stringency solution utilizes tetramethylammonium chloride (TMAC) for washing oligonucleotide probes. One stringent washing solution is 3 M TMAC, 50 mM Tris-HCl, pH 8.0, and 0.2% SDS. The washing temperature using this solution is a function of the length of the probe. For example, a 17 base pair probe is washed at about 45-50° C.
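The probe-length examples given above for the 6×SSC / 0.05% sodium pyrophosphate wash can be held in a small lookup table. The sketch below is illustrative only: the names are hypothetical, and it deliberately returns values only for the four probe lengths stated in the text rather than interpolating between them.

```python
# Wash temperature ranges (degrees C) stated for 6xSSC + 0.05% sodium pyrophosphate,
# keyed by oligonucleotide probe length in base pairs.
WASH_RANGES_C = {
    14: (35, 40),
    17: (45, 50),
    20: (52, 57),
    23: (57, 63),
}


def wash_temperature_range(probe_length):
    """Return the (low, high) wash temperature for a probe length given in the text."""
    try:
        return WASH_RANGES_C[probe_length]
    except KeyError:
        raise ValueError(
            f"no wash range is stated for a {probe_length} base pair probe"
        ) from None


if __name__ == "__main__":
    for length in sorted(WASH_RANGES_C):
        low, high = wash_temperature_range(length)
        print(f"{length} bp probe: wash at {low}-{high} C")
```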

Two polynucleotides are said to be “identical” or “homologous” if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned for maximum correspondence as described below. The term “complementary to” is used herein to mean that the complementary sequence is identical to all or a specified contiguous portion of a reference polynucleotide sequence. Sequence comparisons between two (or more) polynucleotides or polypeptides are typically performed by comparing sequences of two optimally aligned sequences over a segment or “comparison window” to identify and compare local regions of sequence similarity. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444 (1988), by computerized implementation of these algorithms (GAP, BESTFIT, BLAST N, BLAST P, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, Wis.), or by inspection. In order to determine the optimal window of alignment, the BLAST program could be used, using matrix BLOSUM 62, or matrices PAM or PAM250, with the default parameters, or parameters modified in order to increase the specificity.

“Percentage of sequence identity or homology” is determined by comparing two optimally aligned sequences over a comparison window, where the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
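The calculation just described can be illustrated with a minimal sketch. It assumes that the two sequences have already been optimally aligned by one of the methods cited above and are supplied as equal-length strings in which gaps are written as “-”; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of matched positions over the whole comparison window.

    Both inputs must already be aligned (same length, gaps written as `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    # A position counts as matched when both sequences carry the same residue;
    # a gap never counts as a match.
    matched = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    # Divide by the total number of positions in the window, then scale to 100.
    return 100.0 * matched / len(a)


if __name__ == "__main__":
    # Hypothetical 9-position window: 7 matched positions, 1 gap, 1 mismatch.
    print(round(percent_identity("ACGT-ACGT", "ACGTTACCT"), 2))
```

In this example the window has 9 positions of which 7 are matched, so the result is 7/9 × 100, i.e. about 77.78%.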

The nucleic acid of d) presents a homology of at least 80%, more preferably 90%, more preferably 95%, more preferably 98%, the most preferable being 99% with the nucleic acid of a).

Preferred nucleic acids contain a mutation in the codons corresponding to the histidine and aspartic acid residues that are located between nucleotides 39-488, 60-488, 39-1193 and 60-1193.

A preferred nucleic acid contains a mutation at one nucleotide chosen in the group consisting of nucleotides 87, 88, 89, 141, 142, 143, 147, 148, 149, 150, 151, 152, 444, 445, 446, 489, 490, 491, 531, 532, 533, 993, 994 and 995 of SEQ ID No 1.

A preferred nucleic acid contains a mutation at one nucleotide chosen in the group consisting of nucleotides 150, 151, 152, 489, 490, and 491 of SEQ ID No 1.
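The nucleotide positions listed above fall into triplets, each of which corresponds to one codon of the open reading frame. The sketch below groups positions by codon number, assuming that the reading frame starts at nucleotide 39 of SEQ ID No 1 (the first methionine identified in this description) and ends at nucleotide 2114. The function names are hypothetical, and the sketch makes no statement about which amino acid each codon encodes.

```python
CDS_START = 39    # first nucleotide of the coding region in SEQ ID No 1
CDS_END = 2114    # last nucleotide of the coding region in SEQ ID No 1

# Nucleotide positions listed in the description (SEQ ID No 1 coordinates).
LISTED_POSITIONS = [
    87, 88, 89, 141, 142, 143, 147, 148, 149, 150, 151, 152,
    444, 445, 446, 489, 490, 491, 531, 532, 533, 993, 994, 995,
]


def codon_number(nucleotide):
    """Return the 1-based codon number containing a coding-region nucleotide."""
    if not CDS_START <= nucleotide <= CDS_END:
        raise ValueError("nucleotide lies outside nucleotides 39-2114")
    return (nucleotide - CDS_START) // 3 + 1


def group_by_codon(positions):
    """Map each codon number to the listed nucleotide positions it contains."""
    groups = {}
    for pos in positions:
        groups.setdefault(codon_number(pos), []).append(pos)
    return groups


if __name__ == "__main__":
    for codon, members in sorted(group_by_codon(LISTED_POSITIONS).items()):
        print(f"codon {codon}: nucleotides {members}")
```

Under this assumption, the 24 listed positions resolve into 8 complete codons, and the shorter list of six positions (150-152 and 489-491) resolves into two of them.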

The fragment of the nucleic acid of a) contains at least 15 bases, more preferably 25, 50, 60, 75, 100, 150, 200, 250, 300, 400, 500, 600, 750, 1000, 1250, 1500, 2000 bases. These fragments may be used as primers for amplification, or as probes, especially when looking for homologous DNA or DNA hybridizing with the nucleic acid of a). These fragments may also be labeled.

The different labels that may be used are well known to the person skilled in the art, and one can cite ³²P, ³³P, ³⁵S, ³H or ¹²⁵I. Non-radioactive labels may be selected from ligands such as biotin, avidin, streptavidin, digoxigenin, haptens, dyes, and luminescent agents such as radioluminescent, chemoluminescent, bioluminescent, fluorescent or phosphorescent agents.

The fragments according to the invention are preferably biologically active, i.e. they harbor the same biological activity as the native protein coded by nucleotides 39-2114 of SEQ ID No 1, or nucleotides 60-2114 of SEQ ID No 1. This protein is involved in V(D)J recombination and/or DNA repair, and the tests that can be used by the person skilled in the art to assess these activities are well known, one of them being described in the examples.

The most preferred fragments of the nucleic acid of a) are the fragments coding for a protein or a polypeptide (nucleotides 39-2114 or 60-2114 of SEQ ID No 1). One can also cite nucleotides 1-38 and 2115-2354 of SEQ ID No 1, which may contain regulatory sequences (promoters, enhancers . . . ). Another interesting fragment is the fragment coding for a metallo-β-lactamase region (39-488).

The inventors have also demonstrated that it is possible to obtain the V(D)J recombination activity by using only a fragment of the protein, namely the fragment coded by nucleotides 39-1193 or 60-1193. This specific fragment is also preferred in the present invention.

Interesting fragments are also the fragments corresponding to the exons, i.e. corresponding, in SEQ ID No 1, to nucleotides 1-147, 39-147, 148-199, 200-284, 285-344, 345-400, 401-502, 503-575, 576-716, 717-818, 819-955, 956-1010, 1011-1099, 1100-1194, 1195-2114, 1195-2354, but also the fragments corresponding to nucleotides 1-199, 1-284, 1-344, 1-400, 1-502, 1-575, 1-716, 1-818, 1-955, 1-1010, 1-1099, 1-1194, 1-2114, 39-199, 39-284, 39-344, 39-400, 39-502, 39-575, 39-716, 39-818, 39-955, 39-1010, 39-1099, 39-1194, 39-2354, 60-199, 60-284, 60-344, 60-400, 60-502, 60-575, 60-716, 60-818, 60-955, 60-1010, 60-1099, 60-1194, 60-2354, 148-284, 148-344, 148-400, 148-502, 148-575, 148-716, 148-818, 148-955, 148-1010, 148-1099, 148-1194, 148-2114, 148-2354, 200-344, 200-400, 200-502, 200-575, 200-716, 200-818, 200-955, 200-1010, 200-1099, 200-1194, 200-2114, 200-2354, 285-400, 285-502, 285-575, 285-716, 285-818, 285-955, 285-1010, 285-1099, 285-1194, 285-2114, 285-2354, 345-502, 345-575, 345-716, 345-818, 345-955, 345-1010, 345-1099, 345-1194, 345-2114, 345-2354, 401-575, 401-716, 401-818, 401-955, 401-1010, 401-1099, 401-1194, 401-2114, 401-2354, 503-716, 503-818, 503-955, 503-1010, 503-1099, 503-1194, 503-2114, 503-2354, 576-818, 576-955, 576-1010, 576-1099, 576-1194, 576-2114, 576-2354, 717-955, 717-1010, 717-1099, 717-1194, 717-2114, 717-2354, 819-1010, 819-1099, 819-1194, 819-2114, 819-2354, 956-1099, 956-1194, 956-2114, 956-2354, 1011-1194, 1011-2114, 1011-2354, 1100-2114, or 1100-2354.
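The multi-exon fragments listed above are built by joining the start of one exon to the end of a later one. The sketch below illustrates this construction from the single-exon coordinates quoted in the text. It is an assumption-laden illustration: the reading of the listed intervals as 14 contiguous exons of the cDNA is inferred from the coordinates, the names are hypothetical, and the sketch enumerates only spans between full exon boundaries (it does not add the alternative start positions 39 and 60 or the alternative end position 2114 that also appear in the list).

```python
# Exon boundaries in SEQ ID No 1 coordinates, as quoted in the description
# (full first exon 1-147 and full last exon 1195-2354).
EXONS = [
    (1, 147), (148, 199), (200, 284), (285, 344), (345, 400),
    (401, 502), (503, 575), (576, 716), (717, 818), (819, 955),
    (956, 1010), (1011, 1099), (1100, 1194), (1195, 2354),
]


def exons_are_contiguous(exons):
    """Check that each exon starts right after the previous one ends."""
    return all(nxt[0] == cur[1] + 1 for cur, nxt in zip(exons, exons[1:]))


def exon_lengths(exons):
    """Length in nucleotides of each exon (inclusive coordinates)."""
    return [end - start + 1 for start, end in exons]


def multi_exon_spans(exons):
    """All fragments running from the start of exon i to the end of exon j, j > i."""
    return [
        (exons[i][0], exons[j][1])
        for i in range(len(exons))
        for j in range(i + 1, len(exons))
    ]


if __name__ == "__main__":
    print("contiguous:", exons_are_contiguous(EXONS))
    print("exon lengths:", exon_lengths(EXONS))
    spans = multi_exon_spans(EXONS)
    print(len(spans), "multi-exon spans, for example", spans[:3])
```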

It is important to note that the fragments according to the invention are preferably not entirely comprised between nucleotides 158-609, 607-660, or 29-537 of SEQ ID No 1. Indeed, these nucleotides correspond to ESTs that have been disclosed in GenBank, under accession numbers AA306797 (nucleotides 29-537) and AA315885 (nucleotides 158-609, 607-660). These EST disclosures are nevertheless incomplete, as AA306797 comprises an “N” at position 91, which is ambiguous because it can stand for any of the 4 different nucleotides (the actual nucleotide is a “C”, as seen at position 119 of SEQ ID No 1). Furthermore, AA306797 starts before the first methionine as identified in the present invention (nucleotide 39 of SEQ ID No 1), and therefore does not suggest the actual start codon of the protein of the invention. EST AA315885 possesses an additional “C” as compared to the sequence of the invention at nucleotide 453 of AA315885 (corresponding to nucleotide 610 of SEQ ID No 1).

The disclosures of AA306797 and AA315885 have been used by Dronkert et al. (2000, Mol. Cell. Biol., 20, 4553-61) to obtain (after translation of the EST) part of the human SNM1c protein that is partly homologous to the murine SNM1 object of the disclosure. The problems in the two ESTs that are mentioned above led to a mistaken translated protein.

It is also worth noting that Wood et al. (2001, Science, 291, 1284-9) mention the gene EST corresponding to SNM1C as being located on chromosome 10, but do not specify the localization (10p), nor do they give any information other than EST AA315885, which was shown to be erroneous.

It is also preferred if the fragments according to the invention, or the sequences complementary to them, are not comprised in their entirety between nucleotides 1 to 35 and 37 to 189 of SEQ ID No 1. These nucleotides have been disclosed in an EST available in GenBank under accession number AA278590, which is incomplete and erroneous, as it lacks the “G” nucleotide present at position 36 of SEQ ID No 1.

It is also preferred if the fragments according to the invention, or the sequences complementary to them, are not comprised in their entirety between nucleotides 1848 to 2321. These nucleotides correspond to some nucleotides present in an EST disclosed in GenBank under accession number AI859962, which is incomplete. This EST discloses an erroneous part of the complementary sequence of SEQ ID No 1: the nucleotides complementary to the nucleotides present at the start of EST AI859962 do not correspond to the last nucleotides of SEQ ID No 1.

This EST contains, in part, EST AA278850, which is also incomplete and erroneous (mismatch of a nucleotide as compared with the complement of SEQ ID No 1).

Therefore, these disclosures are not only erroneous and incomplete, but also fail to link the nucleic acid of the invention, or fragments as defined above, to V(D)J recombination and the SCID deficiency, as the inventors have done in the present application. This would have confused the person skilled in the art, who would not have been able to get much information from these disclosures.

The present invention is also drawn to a vector comprising the nucleic acid molecule of the invention, especially the nucleic acid sequence corresponding to nucleotides 39-2114, 60-2114, 39-1193, or 60-1193 of SEQ ID No 1. Numerous vectors are known in the art and they may be, by way of example, expression vectors or amplification vectors.

The invention is also drawn to a host cell comprising the vector according to the invention.

The invention is also drawn to a process of producing a protein involved in V(D)J recombination and/or DNA repair comprising the steps of:

-   a) expressing the nucleic acid molecule according to the invention in a suitable host to synthesize a protein involved in V(D)J recombination and/or DNA repair and
-   b) isolating the protein involved in V(D)J recombination and/or DNA repair.

The present invention is also drawn to an isolated nucleic acid molecule that is the complement of the isolated nucleic acid molecule of the invention, as previously defined, and to an isolated protein or peptide coded by the nucleic acid of the invention. As seen below, the protein or peptide of the invention may be obtained either through recombinant DNA, chemical, or other techniques.

The terms polypeptide and protein are to be understood as meaning a specific string of amino-acids that may be natural or synthetic. The person skilled in the art is aware of ways of varying amino-acids. Preferred proteins or polypeptides are especially SEQ ID No 2, and amino-acids 8-692, 1-385 or 8-385 of SEQ ID No 2.
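The preferred amino-acid ranges can be related to the nucleotide ranges used elsewhere in this description by simple codon arithmetic. The sketch below assumes that SEQ ID No 2 is translated from the reading frame starting at nucleotide 39 of SEQ ID No 1, with three nucleotides per residue; the function name is hypothetical.

```python
CDS_START = 39  # first nucleotide of the coding region in SEQ ID No 1


def nucleotide_range(first_residue, last_residue):
    """Return the (start, end) nucleotides of SEQ ID No 1 coding for a residue range."""
    if not 1 <= first_residue <= last_residue:
        raise ValueError("residue range must satisfy 1 <= first <= last")
    start = CDS_START + 3 * (first_residue - 1)   # first base of the first codon
    end = CDS_START + 3 * last_residue - 1        # last base of the last codon
    return start, end


if __name__ == "__main__":
    for first, last in [(1, 692), (8, 692), (1, 385), (8, 385)]:
        start, end = nucleotide_range(first, last)
        print(f"amino-acids {first}-{last}: nucleotides {start}-{end}")
```

Under this assumption, amino-acids 1-692, 8-692, 1-385 and 8-385 correspond to nucleotides 39-2114, 60-2114, 39-1193 and 60-1193 respectively, which are the ranges recited for the preferred nucleic acids.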

The expression vectors of the invention preferably contain a promoter, translation initiation and termination signals, as well as appropriate regions for regulating transcription. They need to be maintained in the host cell. The person skilled in the art is aware of such vectors and of the ways to produce and purify proteins, especially by using labels (like Histidine Tag, or glutathione). It is also possible to use in vitro translation kits, which are widely available, to produce the protein or peptide according to the invention.

The nucleic acid of the invention or a fragment thereof can be inserted into an appropriate expression or amplification vector using standard ligation techniques, or homologous recombination. The vector is chosen to allow amplification of the nucleic acid of the invention and/or expression of the gene.

The vectors may be chosen so as to be functional in a large variety of hosts, such as prokaryotic, yeast, insect (baculovirus systems) and/or eukaryotic host cells. Selection of the host cell may depend on the properties of the polypeptide or fragment thereof to be expressed, for example if post-translational modifications are needed (such as glycosylation and/or phosphorylation). If so, eukaryotic cells, such as yeast, insect, or mammalian host cells, are preferable.

The person skilled in the art is aware of the different types of vectors that may be used, and of the different techniques employed to make them enter the host cells (electroporation, lipofection . . . ).

Reviews of the different usable vectors, host cells and regulatory DNA sequences on the vectors, depending on the host cells, may be found in U.S. Pat. No. 6,165,753, in particular column 9, line 34 to column 13, line 36. This technical part of the document, which relates to the cyclin E2 gene and polypeptide but may be generalized to any other cDNA or gene, is incorporated herein by reference.

Another review of the different vectors, regulatory sequences, host cells, and methods of expression of polypeptides that may be used can be found in WO 99/55730, page 6, line 3 to page 25, line 6, which is herein incorporated by reference.

In summary, the vectors of the invention contain at least one selectable marker gene that encodes a protein necessary for the survival and growth of the host cell in a selective culture medium. Typical selection marker genes encode proteins that either confer resistance to antibiotics such as ampicillin, tetracycline, or kanamycin for prokaryotic host cells, or complement auxotrophic deficiencies of the cell (use of ura- yeasts, for example). Preferred selectable markers are the kanamycin resistance gene, the ampicillin resistance gene, and the tetracycline resistance gene. The kanamycin resistance gene is preferred when vectors active both in prokaryotic and eukaryotic (mammalian) cells are used, as the protein gives resistance to neomycin in mammalian cells.

The vectors of the invention also contain a ribosome binding sequence such as a Shine-Dalgarno sequence, a Kozak sequence, or an internal ribosome entry site (IRES).

A signal sequence may also be used to direct the polypeptide out of the host cell where it is synthesized. Many signal sequences are known by persons skilled in the art, and any of them that are functional (homologous or heterologous) in the selected host cell may be used in conjunction with the nucleic sequence according to the invention.

The vectors of the invention are typically derived from a starting vector such as a commercially available vector. Such vectors may or may not contain some of the elements to be included in the completed vector, such elements being introduced by appropriate molecular biology techniques. Thus, the added elements may be individually introduced into the vector after enzymatic digestion and ligation of the element. This procedure is well known in the art and is described for example in Sambrook et al., supra.

Preferred starting vectors from which the vectors of the invention can be obtained are compatible with bacterial, insect, and mammalian host cells, and include, without limitation, pCRII, pCR3, and pcDNA3 (Invitrogen Company, San Diego, Calif.), pBSII (Stratagene Company, La Jolla, Calif.), pET15b (Novagen, Madison, Wis.), pGEX (Pharmacia Biotech, Piscataway, N.J.), pEGFP-N2 (Clontech, Palo Alto, Calif.), pETL (BlueBacII; Invitrogen), and pFastBacDual (Gibco/BRL, Grand Island, N.Y.).

After construction of the vector and insertion of the nucleic acid of the invention, the completed vector may be inserted into a suitable host cell for amplification and/or polypeptide expression.

Such host cells may be prokaryotic host cells (such as E. coli, Bacillus subtilis) or eukaryotic host cells (such as a yeast cell, an insect cell, or a vertebrate cell). The host cell, when cultured under appropriate conditions, can synthesize the polypeptide, which can subsequently be collected from the culture medium (if the host cell secretes it into the medium) or directly from the host cell producing it (if it is not secreted). After collection, the polypeptide can be purified using methods such as molecular sieve chromatography, affinity chromatography, and the like.

Selection of the host cell for the polypeptide production will depend in part on whether the polypeptide is to be glycosylated or phosphorylated (in which case eukaryotic host cells are preferred), and the manner in which the host cell is able to “fold” the protein into its native tertiary structure (e.g., proper orientation of disulfide bridges, etc.) such that biologically active protein is prepared by the cell. When the host cell does not synthesize the polypeptide in a form that has the biological activity of V(D)J recombination and/or DNA repair, the polypeptide may be “folded” after purification, by various dialysis steps.

Suitable mammalian cells or cell lines are well known in the art and include Chinese hamster ovary (CHO) cells, HeLa, HEK293 and Hep-2 cells, the monkey COS-1, COS-7 and CV-1 cell lines, mouse neuroblastoma N2A cells, mouse L-929 cells, 3T3 lines derived from Swiss, Balb-c or NIH mice, and the BHK or HaK hamster cell lines.

Useful bacterial host cells suitable for the present invention include the various strains of E. coli, such as HB101, DH5α, DH10, or MC1061. Various strains of B. subtilis, Pseudomonas spp., other Bacillus spp., Streptomyces spp., and the like may also be employed in this method.

Many strains of yeast cells known to those skilled in the art are also available as host cells for expression of the polypeptides of the present invention; preferred strains include those of Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces, and the like.

Entry (also referred to as “transformation” or “transfection”) of the vector into the selected host cell may be achieved using such methods as calcium chloride, electroporation, microinjection, lipofection or the DEAE-dextran method. The method selected will in part be a function of the type of host cell to be used. These methods and other suitable methods are well known to the skilled artisan, and are set forth, for example, in Sambrook et al., supra.

The person skilled in the art is aware of the medium to use for cultivation of the host cells, including the selection agent (antibiotic) to add to said medium, depending on the marker that is on the vector inserted into the cell.

The amount of polypeptide produced in the host cell can be evaluated using standard methods known in the art, such as, without limitation, Western blot analysis, SDS-polyacrylamide gel electrophoresis, non-denaturing gel electrophoresis, HPLC separation, immunoprecipitation, and/or activity assays such as DNA binding gel shift assays.

Purification of the polypeptide produced by cells of the invention, when present in solution (secretion or excretion), can be accomplished using a variety of techniques. If the polypeptide contains a tag such as hexahistidine, it may essentially be purified in a one-step process by passing the solution through an affinity column whose matrix has a high affinity for the tag or for the polypeptide directly (for example, a nickel column for poly-histidine).

Where the polypeptide is prepared without a tag attached, other well known procedures for purification can be used. Such procedures include, without limitation, ion exchange chromatography, molecular sieve chromatography, HPLC, native gel electrophoresis in combination with gel elution, and preparative isoelectric focusing (“Isoprime” machine/technique, Hoefer Scientific). In some cases, two or more of these techniques may be combined to achieve increased purity.

If it is anticipated that the polypeptide will be found primarily intracellularly, the intracellular material (including inclusion bodies for gram-negative bacteria) can be extracted from the host cell using any standard technique known to the person skilled in the art. For example, the host cells can be lysed to release the contents of the periplasm/cytoplasm by French press, homogenization, and/or sonication followed by centrifugation.

If the polypeptide has formed inclusion bodies in the periplasm, the polypeptide is obtained using methods known in the art, especially with the aid of chaotropic agents such as guanidine or urea to release, break apart, and solubilize the inclusion bodies. Subsequent dialysis helps to solubilize the polypeptide.

One can also use other standard methods well known to the skilled artisan, such as separation by electrophoresis followed by electroelution, various types of chromatography (immunoaffinity, molecular sieve, and/or ion exchange), and/or high pressure liquid chromatography. In some cases, it may be preferable to use more than one of these methods for complete purification.

In addition to preparing and purifying the polypeptide using recombinant DNA techniques, the polypeptides, fragments, and/or derivatives thereof may be prepared by chemical synthesis methods (such as solid phase peptide synthesis) using techniques known in the art such as those set forth by Merrifield et al. (J. Am. Chem. Soc., 85:2149 [1963]), Houghten et al. (Proc. Natl. Acad. Sci. USA, 82:5132 [1985]), and Stewart and Young (Solid Phase Peptide Synthesis, Pierce Chemical Co., Rockford, Ill. [1984]). Such polypeptides may be synthesized with or without a methionine on the amino terminus. Chemically synthesized polypeptides or fragments may be oxidized using methods set forth in these references to form disulfide bridges. These synthetic polypeptides or fragments are expected to have biological activity comparable to the polypeptides produced recombinantly or purified from natural sources, namely an ability to promote V(D)J recombination and/or DNA repair, and thus may be used interchangeably with recombinant or natural polypeptide.

The polypeptide of the invention may be chemically derivatized, or associated with a polymer. The modified polypeptides according to the invention may have different pharmacological properties than the unmodified polypeptides, such as an increased or a decreased half-life after administration to an animal or human, or different pharmacokinetics.

The polypeptides according to the invention, their fragments, variants, and/or derivatives may be used to prepare antibodies using standard methods, for example after administration to an animal such as a mouse, a rat, a rabbit or a goat using an appropriate adjuvant (in particular Freund's Complete or Incomplete adjuvant).

Thus, antibodies that react with the polypeptides of the invention, as well as reactive fragments of such antibodies, are also encompassed within the scope of the present invention. The antibodies may be polyclonal, monoclonal, recombinant, chimeric, single-chain and/or bispecific. In a preferred embodiment, the antibody or fragment thereof will either be of human origin, or will be “humanized”, i.e., prepared so as to prevent or minimize an immune reaction to the antibody when administered to a patient.

The antibody fragment may be any fragment that is reactive with the polypeptides of the present invention. The invention also encompasses the hybridomas generated by presenting the polypeptide according to the invention or a fragment thereof as an antigen to a selected mammal, followed by fusing cells (e.g., spleen cells) of the mammal with certain cancer cells to create immortalized cell lines by known techniques, such as the technique of Köhler and Milstein (1975, Nature 256, 495).

The antibodies according to the invention are, for example, chimeric antibodies, humanized antibodies, or Fab or F(ab′)₂ fragments. They may be immunoconjugates or labeled antibodies.

The inventors of the present invention have demonstrated that the Artemis gene (also named SNM1C) codes for a novel V(D)J recombination and/or DNA repair factor that belongs to the metallo β-lactamase superfamily and whose mutations give rise to the human RS-SCID condition.

Therefore, the present invention is also drawn to a method for the determination of the type of SCID in a patient comprising the step of analyzing the nucleic acid chosen from the group consisting of SEQ ID No 1, nucleotides 39-2114, 60-2114, 39-1193, and 60-1193 of SEQ ID No 1 in said patient, a mutation in said nucleic acid allowing the classification of said SCID as radiosensitive SCID.

The mutations searched for comprise point mutations that lead to a non-conservative change in an amino acid of the protein or to the production of a premature termination codon, deletions, insertions, or modifications due to changes in the genomic DNA corresponding to SEQ ID No 1 and in the splice donor and acceptor sites flanking the exons.
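The point-mutation categories mentioned above can be illustrated with a minimal Python sketch. It classifies a single-codon substitution using the standard genetic code; the function name and the example codons are hypothetical and are not taken from SEQ ID No 1. Deciding whether a missense change is non-conservative would require an additional amino-acid similarity criterion, which is not attempted here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, mut_codon):
    """Classify a substitution affecting one codon (hypothetical helper)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    mut_aa = CODON_TABLE[mut_codon.upper()]
    if ref_aa == mut_aa:
        return "silent"
    if mut_aa == "*":
        return "nonsense"   # premature termination codon
    if ref_aa == "*":
        return "stop-loss"  # the normal stop codon is lost
    return "missense"       # amino-acid change; conservativeness not assessed


# Illustrative codons only, not positions of the Artemis sequence.
print(classify_codon_change("CGA", "TGA"))  # Arg -> stop
print(classify_codon_change("GGA", "GAA"))  # Gly -> Glu
```

Deletions, insertions and splice-site changes are outside the scope of this per-codon check and would need an alignment against the genomic reference.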

The invention is also drawn to a method of diagnosis in a patient, including a prenatal diagnosis, of a condition chosen from the group consisting of a SCID, a predisposition to cancer, an immune deficiency, and the carriage of a mutation increasing the risk of progeny having such a disease, comprising the step of analyzing the nucleic acid chosen from the group consisting of SEQ ID No 1, nucleotides 39-2114, 60-2114, 39-1193, and 60-1193 of SEQ ID No 1 and/or the protein coded by said nucleic acid in said patient, a mutation in said nucleic acid and/or protein indicating an increased risk of having said condition.

It is also encompassed that the method of diagnosis according to the invention is performed so as to analyze one or both alleles of the patient, looking for any mutations (point mutations, deletions, insertions, mutations leading to incomplete splicing, etc.) that will lead to the production of a non-functional protein. Therefore, it is to be understood that the analysis of the nucleic acid chosen from the group consisting of SEQ ID No 1, nucleotides 39-2114, 60-2114, 39-1193, and 60-1193 of SEQ ID No 1 is performed directly or indirectly on the genomic DNA.

These methods of diagnosis are preferably performed in vitro, on DNA or RNA obtained from cells harvested from the patient.

The methods of diagnosis according to the invention may be performed on the genomic DNA isolated from the patient, for example by amplification of the exons, in particular with the pairs of primers SEQ ID No 5 to SEQ ID No 32, which are located in the introns of the Artemis gene and allow the amplification of the exons and of the intron-exon junctions, which the inventors have shown to be mutated in some RS-SCID patients. It is clear that any primer that would amplify the exons or allow the analysis of the exon-intron junctions may be used in this embodiment of the method of diagnosis according to the invention.

One way of performing a method of diagnosis according to the invention would be to perform a PCR-SSCP, or to sequence the amplification product to determine the mutations in the Artemis gene.
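Once an amplification product has been sequenced, the comparison with the reference exon can be sketched as follows. This is a minimal illustration in Python, assuming the patient read and the reference are already aligned and of equal length; the helper name and the toy sequences are hypothetical and do not correspond to any exon of SEQ ID No 1.

```python
def find_substitutions(reference, patient):
    """List (position, reference base, patient base) for each mismatch.

    Positions are 1-based. Sequences of different length point to an
    insertion or deletion, which needs a proper alignment step first.
    """
    if len(reference) != len(patient):
        raise ValueError("length mismatch: align the sequences first")
    return [
        (index + 1, ref_base, pat_base)
        for index, (ref_base, pat_base)
        in enumerate(zip(reference.upper(), patient.upper()))
        if ref_base != pat_base
    ]


# Toy sequences for illustration only.
reference_exon = "ATGCGATTT"
patient_read = "ATGTGATTT"
print(find_substitutions(reference_exon, patient_read))
```

Each reported position could then be examined for its effect on the encoded codon or on a flanking splice site.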

The protein ARTEMIS (or SNM1C) coded by the nucleic acid of the invention is therefore involved in V(D)J recombination and/or DNA repair. Mutations in the nucleic acid leading to the expression of a non-functional protein, when occurring on both alleles in a patient, lead to a SCID condition in said patient. It is also foreseen that the properties of the protein ARTEMIS may be exploited in other fields, such as cancer therapy, as compounds that interact with the protein in a cancerous cell may help sensitize said cell to an antitumoral agent whose action is to break the DNA (radiation, chemical agent).

The invention is therefore drawn to a method of identification of a compound capable of binding to the nucleic acid or to the protein of the invention, comprising the steps of

-   a) contacting said nucleic acid or protein with a candidate compound, and
-   b) determining the binding between said candidate compound and said nucleic acid or protein.

The methods for assessing the binding of a compound to a nucleic acid or a protein are well known in the art, and are preferably performed in vitro. One such method is to link the nucleic acid or the protein to a solid support over which the compound to be tested is passed, and to check the recovery of the compound after passage over the support. By adjusting the parameters, it is also possible to determine the binding affinity. The compounds may also be found by other methods, such as FRET or SPA, when the compounds and the nucleic acid or protein are labeled. The assay may also be performed on cells containing the nucleic acid of the invention, for example on a vector according to the invention, and/or expressing the protein according to the invention. This also provides information on the capacity of the compound to cross the membrane and penetrate into cells. These cells can be bacterial cells (search for antibiotic compounds binding to the β-lactamase fragment of the protein), or mammalian cells.

The invention is therefore also drawn to a compound identified by the above-described method, said compound binding to the nucleic acid or the protein of the invention.

Particularly preferred compounds are compounds that bind to the β-lactamase region of the protein of the invention (first 180 amino-acids of SEQ ID No 2) or the associated β-CASP domain (amino-acids 181-385 of SEQ ID No 2). Such compounds may have a chemical formula close to formula I or IV of WO 00/63213 (said formulae and the corresponding part of the specification being incorporated herein by reference), or to the compounds of WO 99/33850, WO 01/02411 or U.S. Pat. No. 6,150,350, the description of these compounds being incorporated herein by reference.

A compound identified by a method according to the invention may be a compound with a chemical backbone (chemical compound), a lipid, a carbohydrate (sugar), a protein, a peptide, a hybrid compound (protein-lipid, protein-carbohydrate, peptide-lipid, peptide-carbohydrate), or a protein or a peptide onto which various chemical residues have been grafted.

The foreseen chemical compounds (with a chemical backbone) may contain one or more (up to 3 or 4) rings, especially aromatic rings, in particular having from 3 to 8 carbon atoms, and bearing all kinds of substituent groups (in particular lower alkyl, i.e. having from 1 to 6 carbon atoms, keto groups, alcohol groups, halogen groups, etc.). The person skilled in the art knows how to prepare different variants of a compound starting from a given backbone by grafting these radicals onto said backbone.

The method of the invention also allows the screening, detection and/or identification of compounds able to inhibit the biological activity of the ARTEMIS protein. Indeed, it is possible to test the compounds identified by the method according to the invention as binding to the nucleic acid or the protein in an assay such as a complementation assay, wherein a vector carrying SEQ ID No 1 or nucleotides 39-2114 of SEQ ID No 1 and expressing a functional protein is introduced into a cell harvested from a patient suffering from RS-SCID, and the compound is tested for its ability to inhibit the restoration of V(D)J recombination and/or DNA repair in said cell. The examples illustrate such an assay, which is preferably performed in vitro.

The present invention thus allows the detection, identification and/or screening of compounds that may be useful for the treatment of diseases where V(D)J recombination and/or DNA repair is involved. Nevertheless, the compounds identified by the method according to the invention may need to be optimized before being used in a therapeutic treatment, in order to have a superior activity and/or a lesser toxicity.

Indeed, the development of new drugs is often performed on the following basis:

-   screening of compounds with the sought activity, on a relevant model, by an appropriate method,
-   selection of the compounds that have the required properties from the first screening test (here, modulation of V(D)J recombination and/or DNA repair),
-   determination of the structure of the selected compounds (in particular the sequence, and if possible the tertiary structure, if they are peptides, proteins or nucleic acids; the formula and backbone if they are chemical compounds),
-   optimization of the selected compounds by modification of the structure, for example by changing the stereochemical conformation (for example passage of the amino acids in a peptide from L to D), by addition of substituents on the peptidic or chemical backbones, in particular by grafting groups or radicals onto the backbone, or by modification of the peptides (see in particular Gante, “Peptidomimetics”, in Angewandte Chemie International Edition in English, 1994, 33, 1699-1720),
-   passage and screening of the “optimized” compounds on appropriate models, which are often models nearer to the studied pathology. At this stage, one would often use animal models, in particular rodents (rats or mice), dogs or non-human primates, that are good models of SCID or cancers, and look for the phenotypic changes in said models after administration of the compound.

The present invention also encompasses the compounds that have been optimized by following the steps described above or equivalent steps.

The present invention is also drawn to any one of the nucleic acid, the vector, the host cell, the protein, the antibody, and the compound of the invention as a medicament. These entities can indeed be used alone for the treatment of different types of diseases, in particular those in which V(D)J recombination and/or DNA repair is involved, said diseases including, without limitation, RS-SCID, immune deficiency, and cancer. These entities can also be used in combination with another treatment that is appropriate for the disease, the use being simultaneous, separate or sequential.

In particular, the products according to the invention may be used in combination with cytokines, growth factors, antibiotics, anti-inflammatories, and/or chemotherapeutic agents as is appropriate for the indication being treated.

The invention is also drawn to a pharmaceutical composition comprising a pharmaceutically acceptable excipient, carrier or diluent with at least one of the nucleic acid, the vector, the host cell, the protein, the antibody, and the compound of the invention.

The person skilled in the art knows the appropriate excipients and carriers that can be used, and one may cite, by way of example, water for injection, preferably supplemented with other materials common in solutions for administration to mammals, or neutral buffered saline or saline mixed with serum albumin. Some carriers may be found in Remington's Pharmaceutical Sciences, 18th Edition, A. R. Gennaro, ed., Mack Publishing Company [1990].

The pharmaceutical composition of the invention may be administered orally, nasally, mucosally, or by injection, in particular intravenously, intramuscularly, or subcutaneously. The carrier and/or excipient will be chosen appropriately depending on the route of administration. U.S. Pat. No. 6,165,753, already cited, gives examples of suitable routes of administration and excipients (column 17 line 31 to column 21 line 55, the general information being incorporated herein by reference).

The invention is also directed to a method for the therapy of a severe combined immunodeficiency (SCID), comprising administering to a subject at least one of the nucleic acid, the vector, the host cell, the protein, the antibody, the compound and the pharmaceutical composition of the invention.

U.S. Pat. No. 6,165,753 also describes methods of gene therapy, and this teaching is incorporated by reference.

In a preferred embodiment, said nucleic acid is administered to said subject so as to enter stem cells, that is, the cells that are the progenitors of the cells of the immune system. This may be performed in vivo, or ex vivo, after harvesting the cells of the patient, selecting the stem cells, introducing the nucleic acid into said selected cells, and reinjecting the transformed cells into the patient.

For penetration of the nucleic acid into the cells, different means may be used by the person skilled in the art. In particular, it is possible to introduce said nucleic acid into the cells by means of a viral vector.

Said virus may be of human or of non-human origin, as long as it possesses the capability to infect the cells of the patient. In particular, said virus is chosen from the group consisting of adenoviridae, retroviridae (oncovirinae such as RSV, spumavirinae, lentivirus), poxviridae, herpesviridae (HSV, EBV, CMV, etc.), iridovirus, hepadnavirus (hepatitis B virus), papovaviridae (SV40, papillomavirus), parvoviridae (adeno-associated virus, etc.), reoviridae (reovirus, rotavirus), togaviridae (arbovirus, alphavirus, flavivirus, rubivirus, pestivirus), coronaviridae, paramyxoviridae, orthomyxoviridae, rhabdoviridae (rabies virus), bunyaviridae, arenaviridae, picornaviridae (enterovirus, Coxsackievirus, echovirus, rhinovirus, aphthovirus, cardiovirus, hepatitis A virus, etc.), Modified Vaccinia Virus Ankara, and derived viruses thereof.

By derived viruses, it is meant that the virus possesses modifications that adapt it to the human being (if it is a virus of non-human origin that could not infect human cells without said modifications), and/or that reduce its potential or actual pathogenicity. In particular, it is best if the virus used for the gene transfer is defective for replication within the human body. This is an important safety concern, as the control of the expression of the functional gene may be a concern for the implementation of the method of the invention. Nor is it desirable for the viral vector carrying the gene of therapeutic interest to disseminate to other cells or to other people.

This is why the viral vector used in the method of the invention is preferably deficient for replication, and would therefore be prepared with the help of an auxiliary virus or in a complementation cell line that provides in trans the genetic material needed for the preparation of a sufficient viral titer.

Such defective viruses and appropriate cell lines are described in the art, for example in U.S. Pat. No. 6,133,028, which describes deficient adeno-associated viruses (AAV) and the associated complementation cell lines, and the content of which is herein incorporated by reference. Other suitable viruses are described for example in WO 00/34497. For adenoviruses or AAV, it may be advantageous to delete the E1 and/or E4 regions.

For the MFG virus described below, one can use the complementation Ψ-CRIP cell line that was described in Hacein-Bey et al. (1996, Blood, 87, 3108-16), incorporated herein by reference. Other appropriate cell lines could also be used.

In order to improve the long-lasting effect of the correction, one would prefer a virus that allows the integration of said functional gene into a chromosome of the infected cells.

In particular, one would choose adenoviruses, some replication-defective forms of which are well known to the person skilled in the art, or retroviruses, in particular murine-derived retroviruses. Among the retroviruses that can be used, one would prefer a myeloproliferative sarcoma virus (MPSV)-based vector as described in Bunting et al. (1998, Nature Medicine, 4, 58-64, the content of which is incorporated herein by reference). Another well-suited retrovirus that can be used for the implementation of the method of the invention is the MFG vector, derived from the MLV virus (Moloney retrovirus), described in Hacein-Bey et al. (1996, Blood, 87, 3108-16) or Cavazzana-Calvo et al. (2000, Science, 288, 669-72), the content of both these documents being incorporated herein by reference.

The choice of the virus to be used for the implementation of the method of the invention will depend on the characteristics of said virus and of the complementation cell line. It is clear that different viruses have different properties (in particular the LTR in retroviruses), and that the viruses and cell lines cited above are only examples of means that can be used for the implementation of the method of the invention, and shall not be considered as restrictive. The person skilled in the art knows how to choose the best combination of gene, virus and cell line and/or auxiliary virus for any given situation.

In another embodiment, said nucleic acid is introduced into cells by means of a synthetic vector which can be chosen from the group consisting of a cationic amphiphile, a cationic lipid, a cationic or neutral polymer, a protic polar compound such as propylene glycol, polyethylene glycol, glycerol, ethanol, 1-methyl-L-2-pyrrolidone or their derivatives, and an aprotic polar compound such as dimethyl sulfoxide (DMSO), diethyl sulfoxide, di-n-propyl sulfoxide, dimethyl sulfone, sulfolane, dimethylformamide, dimethylacetamide, tetramethylurea, acetonitrile or their derivatives. The person skilled in the art is aware of synthetic vectors that can be used and allow a high level of transfection, such as the Lipofectin and Lipofectamine reagents available from Life Technologies (Bethesda, Md.).

It is envisioned that the expression of the functional Artemis gene within the cells of the SCID patient will bring a selective advantage to said cells or progeny of said cells, as compared to non-transformed cells or progeny of said non-transformed cells, and lead to an improvement in the condition of the patient.

By selective advantage, it is understood that the proportion of cells that have been corrected by introduction of the functional gene, compared to the cells of the same type in which said gene is not functional, will increase, and that this would lead to an alleviation of the disease, or even a cure.
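The notion of selective advantage can be made concrete with a toy calculation. The Python sketch below assumes, purely for illustration, that corrected cells expand by a constant factor relative to uncorrected cells at each generation; the function name and all numerical values are hypothetical and are not measurements from the present work.

```python
def corrected_fraction(initial_fraction, relative_advantage, generations):
    """Fraction of corrected cells after a number of generations.

    relative_advantage is the per-generation expansion of corrected cells
    relative to uncorrected cells (1.0 means no advantage).
    """
    corrected = initial_fraction * relative_advantage ** generations
    uncorrected = 1.0 - initial_fraction
    return corrected / (corrected + uncorrected)


# Starting from 1% corrected cells with a hypothetical two-fold advantage.
for n in (0, 5, 10):
    print(n, round(corrected_fraction(0.01, 2.0, n), 3))
```

Under these assumptions even a small starting proportion of corrected cells comes to dominate the population, which is the qualitative behavior described above.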

The possibility of obtaining such a selective advantage has been demonstrated for another gene by Cavazzana-Calvo et al. (2000, Science, 288, 669-72).

The method of the invention is best performed when the cells into which the functional Artemis gene is introduced are stem cells or undifferentiated cells that are progenitors of the cells lacking expression of the functional Artemis gene. As a result, a large number of cells are corrected over time, as the progeny of said stem cells are also corrected. Furthermore, as the stem cells differentiate over time into a large number of cells, only a small initial number of cells needs to be transfected.

It is possible to identify some of the stem cells of the haematopoietic system, as they harbor the marker CD34 at their surface (CD34+ cells). Their targeting may be performed in vivo, or ex vivo (sorting of these cells, transfection and reinfusion by intravenous injection). The person skilled in the art knows that stem cells can be isolated from cord blood.

It may be advantageous to induce said stem cells to enter the cell cycle, in order to increase the efficiency of transfection. This is particularly true when the method of the invention is performed on cells ex vivo.

Indeed, some of the vectors that can be used for the entry of the gene of interest into the cells can only be incorporated in non-quiescent cells. In order to have the cells replicate, one will preferably use growth factors and/or cytokines, such as CSF (GM-CSF, M-CSF, G-CSF), interleukins (preferably IL-1, 2, 3, 4, 6, 7, 8, 9, 10, 15), or interferons (in particular α or γ). One could also use stem cell factor, megakaryocyte differentiation factor, optionally coupled with polyethylene glycol, or Flt-3-L. These factors may be used alone or in combination.

When the method of the invention is performed ex vivo, the stem cells may also be maintained in a medium supplemented with other nutrients such as serum (fetal bovine serum, as usual, or preferably fetal cell serum). It may also be advantageous to maintain the cells to be transfected in plates containing cell adhesion elements such as cell adhesion proteins (in particular fibronectin or vitronectin) or peptides that have been shown to promote cell adhesion. It is clear that the choice of the culture medium will be influenced by the type of cells and the sought-after goal.

The invention is also drawn to a method for the therapy of cancer in a patient, comprising administering to said patient at least one of the nucleic acid, the vector, the host cell, the protein, the antibody, the compound and the pharmaceutical composition of the invention, preferably in combination with a genotoxic agent, the use being simultaneous, separate or sequential.

The invention also relates to methods of modulation of the expression and/or activity of the protein ARTEMIS in a cell, comprising contacting said cell with at least one compound selected from the group consisting of the nucleic acid, the vector, the host cell, the protein, the antibody, the compound and the pharmaceutical composition of the invention. This modulation may be performed in vitro, but also in vivo, in any mammal, the compound being appropriately formulated. Compounds of interest include the compounds of the invention that bind to the β-lactamase or the β-CASP fragment of the protein, but also compounds that have an antisense activity.

Antisense molecules specifically bind to the mRNA and interfere with the translation of the protein, possibly by inducing the digestion of said mRNA by RNase H. The antisense molecules envisioned in the present invention may be modified and may harbor phosphorothioate or methylphosphonate bonds. They may also be capped at one extremity, in order to reduce their degradation by nucleases.
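The base-pairing that underlies antisense binding can be sketched as a reverse-complement computation. The short Python example below derives the DNA oligonucleotide complementary to a target written 5′ to 3′ in either the DNA or the RNA alphabet; the function name and the six-base target are hypothetical and are not taken from SEQ ID No 1. Backbone modifications such as phosphorothioates do not change the base sequence and are not modeled.

```python
# Map each target base (DNA or RNA alphabet) to its DNA complement.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_oligo(target):
    """Return the antisense DNA oligonucleotide, written 5' to 3'."""
    return target.upper().translate(COMPLEMENT)[::-1]


# Toy target for illustration only.
print(antisense_oligo("AUGGCU"))
```

Practical oligonucleotide design would also weigh target accessibility, length and specificity, which this sketch leaves aside.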

As the protein ARTEMIS harbors a β-lactamase fragment, the invention also encompasses a method of biotherapy in a patient comprising administering to said patient at least one of the nucleic acid, the vector, the host cell, the protein, the antibody, the compound and the pharmaceutical composition of the invention. It is particularly preferred to use the compound of the invention that binds to the β-lactamase part of the ARTEMIS protein.

Another object of the invention is the use of at least one of the nucleic acid, the vector, the host cell, the protein, the antibody, the compound and the pharmaceutical composition of the invention for the preparation of a medicament intended for the treatment of a disease where V(D)J recombination and/or DNA repair is involved, said disease being in particular chosen from the group consisting of cancer, SCID, especially RS-SCID, and immune deficiency, for example due to problems in the switch of the heavy chains of the immunoglobulins, or due to problems in the process of somatic hypermutation of the immunoglobulins.

The invention also relates to a transgenic non-human mammal having integrated into its genome the nucleic acid sequence of the invention, especially a nucleic acid sequence chosen from the group consisting of nucleotides 39-2114, 60-2114, 39-1193 and 60-1193 of SEQ ID No 1, operatively linked to regulatory elements, wherein expression of said coding sequence increases the level of the ARTEMIS protein and/or the level of V(D)J recombination and/or DNA repair of said mammal relative to a non-transgenic mammal of the same species.

It is preferred when the nucleic acid sequence that is integrated in the genome of the transgenic animal codes for a polypeptide chosen from the group of polypeptides comprising amino-acids 1-692, 8-692, 1-385 or 8-385 of SEQ ID No 2. It is also preferred when the nucleic acid sequence that is integrated in the genome of the transgenic animal codes for a polypeptide chosen from the group of polypeptides consisting of amino-acids 1-692, 8-692, 1-385 or 8-385 of SEQ ID No 2.
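The nucleotide ranges and amino-acid ranges cited in the text are related by simple codon arithmetic, which the Python sketch below checks. It assumes that each range consists only of whole sense codons read in the frame that starts at nucleotide 39 of SEQ ID No 1; the function name is hypothetical.

```python
CDS_START = 39  # first nucleotide of the longest coding range cited


def residues_encoded(nt_start, nt_end, cds_start=CDS_START):
    """Return (first, last) amino-acid positions encoded by a nucleotide range."""
    length = nt_end - nt_start + 1
    if length % 3 or (nt_start - cds_start) % 3:
        raise ValueError("range is not in frame with the coding sequence")
    first = (nt_start - cds_start) // 3 + 1
    last = first + length // 3 - 1
    return first, last


for start, end in [(39, 2114), (60, 2114), (39, 1193), (60, 1193)]:
    print(f"nucleotides {start}-{end} -> amino-acids "
          f"{residues_encoded(start, end)}")
```

Under this assumption the four nucleotide ranges map onto amino-acids 1-692, 8-692, 1-385 and 8-385 respectively, in agreement with the polypeptide ranges given for SEQ ID No 2.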

It is also envisioned that the regulatory elements (promoters, enhancers, introns, similar to those that can be used in mammalian expression vectors) may be tissue-specific, which allows over-expression of the ARTEMIS protein only in a specific type of cells. In particular, the person skilled in the art is aware of the different promoters that can be used for this purpose.

The insertion of the construct in the genome of the transgenic animal of the invention may be performed by methods well known to the person skilled in the art, and can be either random or targeted. Briefly, the person skilled in the art will construct a vector containing the sequence to be inserted into the genome and a selection marker (for example the gene coding for the protein that confers resistance to neomycin), and may introduce it into the Embryonic Stem (ES) cells of an animal. The cells are then selected with the selection marker, and incorporated into an embryo, for example by microinjection into a blastocyst, which can be harvested by perfusing the uterus of pregnant females. Reimplantation of the embryo and selection of the transformed animals, followed by potential back-crossing, allow such a transgenic animal to be obtained. To obtain a “cleaner” animal, the selection marker gene may be excised by use of a site-specific recombinase, if flanked by the correct sequences.

The invention also relates to a transgenic non-human mammal whose genome comprises a disruption of the endogenous ARTEMIS gene, wherein said disruption comprises the insertion of a selectable marker sequence, and wherein said disruption results in said non-human mammal exhibiting a defect in V(D)J recombination and/or DNA repair as compared to a wild-type non-human mammal.

In a preferred embodiment, said disruption is a homozygous disruption.

In a preferred embodiment, said homozygous disruption results in a null mutation of the endogenous gene encoding ARTEMIS.

In a preferred embodiment, said mammal is a rodent; in the most preferred embodiment, said rodent is a mouse.

The invention also encompasses an isolated nucleic acid comprising an ARTEMIS knockout construct comprising a selectable marker sequence flanked by DNA sequences homologous to the endogenous ARTEMIS gene, wherein when said construct is introduced into a non-human mammal or an ancestor of said non-human mammal at an embryonic stage, said selectable marker sequence disrupts the endogenous ARTEMIS gene in the genome of said non-human mammal such that said non-human mammal exhibits a defect in V(D)J recombination and/or DNA repair as compared to a wild-type non-human mammal.

Said construct is used to obtain the animals that have the disrupted copy of the Artemis gene, and is generally carried on a vector that is also an object of the invention.

The invention also relates to a mammalian host cell whose genome comprises a disruption of the endogenous Artemis gene, wherein said disruption comprises the insertion of a selectable marker sequence. Preferably, said disruption is homozygous and leads to a non-expression of a functional ARTEMIS protein (or expression of a non-functional protein).

It is to be noted that the disruption may be obtained by methods known in the art and may be conditional, i.e. only present in specific types of cells, or induced at some moments of the development. One method to achieve such a goal is to use site-specific recombinases such as Cre (recognizing lox sites) or FLP (recognizing FRT sites), under the control of cell-specific promoters. These recombinases (especially Cre) have been shown to be suitable for modifications, and their activity may be induced by injection of a substrate (such as a hormone). These modifications are known in the art and may be found, for example, in Shibata et al. (1997, Science 278, 120-3).

Therefore, the transgenic animal or the cell of the invention may no longer carry the selectable marker, which may have been removed by the action of the recombinases that leads to the disruption of the gene. Nevertheless, in the process of obtaining such a disruption, a selectable marker has been inserted within the Artemis gene, mostly to allow selection of the transformed cells.

U.S. Pat. No. 6,087,555 describes one way of obtaining a knock-out mouse, and the general teaching of this patent is incorporated herein by reference (column 5, line 54 to column 10, line 13). This patent describes an OPG knock-out mouse, but the same method applies to an ARTEMIS knockout mouse. The person skilled in the art will also find information in Hogan et al. (Manipulating the Mouse Embryo: a Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; 1986).

In another embodiment, the invention relates to a non-human transgenic mammal whose genome comprises a first disruption that is of the endogenous ARTEMIS gene, and a second disruption that is of the endogenous p53 gene. In the preferred embodiment, at least said first disruption or said second disruption is a homozygous disruption. In the most preferred embodiment, said first disruption and said second disruption are both homozygous disruptions, which result in particular in null inactivation of both ARTEMIS and p53 genes.

In the preferred embodiment, the transgenic mammal is a rodent, preferably a rat or a mouse.

This transgenic mammal may be used as a model for studying pro-B cell lymphomas. Indeed, the invention relates to a method for providing a model for studying pro-B cell lymphomas, comprising providing said transgenic mammal to a person in need of such model for studying pro-B cell lymphomas, and to a method for studying pro-B cell lymphomas, comprising studying said transgenic mammal.

The animals and “knock-out” cells of the invention may also be used for identification of pharmacologically interesting compounds. Therefore, the invention also relates to a method of screening compounds that modulate V(D)J recombination and/or DNA repair comprising contacting a compound with the non-human mammal or the knock-out host cell of the invention, and determining the increase or decrease of V(D)J recombination and/or DNA repair in said non-human mammal or said host cell as compared to the V(D)J recombination and/or DNA repair of said non-human mammal or said host cell prior to the administration of the compound.

A method of testing the genotoxicity of compounds, comprising contacting a compound with the non-human mammal or a host cell of the invention, and determining the increase or decrease of V(D)J recombination and/or DNA repair in said non-human mammal or said host cell as compared to the V(D)J recombination and/or DNA repair of said non-human mammal or said host cell prior to the administration of the compound, is also an object of the invention.

The present invention also relates to the determination of the binding partners of the proteins coded by one of the nucleic acids of the invention by using the double hybrid assay, for example the assay described by Finley and Brent (Interaction trap cloning with yeast, 169-203, in DNA Cloning, Expression Systems: a practical Approach, 1995, Oxford University Press, Oxford, the content of which is incorporated herein by reference).

Basically, a yeast strain is transformed by two plasmids encoding either the bait protein (the protein coded by one of the nucleic acids of the invention) or the protein supposed to be a binding partner of the bait protein (the prey protein).

Upon binding of the two proteins, a reporter gene is induced and the yeast becomes able to metabolize a substrate in the medium. It is thus possible to determine the binding between two proteins. A cDNA library can be used to screen multiple preys rapidly at the same time.

The double hybrid assay is not limited to yeast, but may also be performed in other types of cells, such as mammalian cells or bacterial cells.

The invention also relates to the complexes that are made of a protein coded by one of the nucleic acids of the invention and one of its binding partners.

Function of Artemis

Although it can be inferred from the phenotype of RS-SCID patients that Artemis is part of the ubiquitous machinery involved in DNA dsb repair that is shared by the V(D)J recombination process, the precise function of this new factor remains to be defined.

The repeated search for a global homolog of Artemis in other species in protein databases failed to provide a strong candidate. The only similarity comes from the 0083 peptide, which includes the whole N-terminus moiety of Artemis, to the various members of the SNM1 family.

However, Artemis is clearly not the human ortholog of either murine SNM1 or yeast PSO2 for at least two reasons:

-   First, despite their SNM1 similarity regions, the three proteins differ in their associated domains. In particular, the 331 amino acids composing the C-terminal region of Artemis are not present in SNM1/PSO2 and do not show any obvious similarities with any other known protein.

-   Secondly, while murine and yeast SNM1/PSO2 mutants demonstrate a strong defect in the repair of DNA damage caused by DNA interstrand cross-linking agents (Henriques and Moustacchi, 1980; Dronkert et al., 2000), they do not display elevated sensitivity to ionizing radiation, indicating that these two proteins are probably not directly involved in the repair of DNA dsb.

This is in sharp contrast to the phenotype of RS-SCID patients whose primary molecular defect is indeed the absence of DNA dsb repair, illustrated by the lack of coding joint formation in the course of V(D)J recombination and the increased sensitivity of bone marrow and fibroblast cells to γ rays (Cavazzana-Calvo et al., 1993; Nicolas et al., 1998).

Interestingly, Artemis, murine SNM1 and yeast PSO2 share a domain adopting a metallo-β-lactamase fold (Aravind, 1997) and probably its associated enzymatic activity, given the presence of nearly all the critical catalytic residues (or conserved residues which could substitute for them functionally). However, there is no obvious consensus with regard to the nature of the various metallo-β-lactamase substrates, outside of a general negatively charged composition. Sequence analysis revealed the existence of a conserved region that accompanies the metallo-β-lactamase domain in members of the Artemis/SNM1/PSO2 subfamily (data not shown), including various other sequences related to nucleic acid metabolism such as two subunits of the cleavage and polyadenylation specificity factor (CPSF).

It is proposed to name this domain, which will be described in detail elsewhere, βCASP for metallo-β-lactamase associated CPSF Artemis SNM1/PSO2 domain. This domain, although highly divergent and tolerating multiple insertions (e.g. within yeast PSO2), harbors several conserved residues, such as the H319 in Artemis, which could play a role in the reaction catalyzed by members of this subfamily.

It is tempting to speculate that this domain could contribute to substrate binding, in a similar way to the α-helical domain of glyoxalase, another member of the β-lactamase family (Cameron et al., 1999).

Artemis in the NHEJ Pathway of DNA Repair

DNA dsb can be repaired either by homologous recombination (HR) or by the non-homologous end-joining pathway (NHEJ) (reviewed in Haber, 2000). While HR is the predominant repair pathway in yeast, NHEJ is mostly utilized in higher eukaryotes and represents the DNA repair pathway followed during V(D)J recombination.

At least three protein complexes are thought to act in concert or sequentially at the site of the RAG1/2 derived dsb. The Ku70-80 complex is probably recruited first at the site of the lesion, followed by the addition of the DNA-PKcs subunit. This initial complex is considered as the primary DNA damage sensor that will activate the DNA repair machinery. The XRCC4/DNA-LigaseIV complex represents the best candidate to actually repair the gap. More recently, the RAD50/MRE11/NBS1 complex, which was known to participate in NHEJ (Carney et al., 1998; Varon et al., 1998), was found at the site of the rearranging TCR genes, arguing for its possible involvement in the DNA repair phase of the V(D)J recombination (Chen et al., 2000).

One would like to know of course how Artemis plays its role in this cascade. At this point only speculative answers, based upon the analogy of phenotypes between the various deficient models including the RS-SCID, can be provided. It is most unlikely that Artemis is linked to the RAD50/MRE11/NBS1 complex. Indeed, in both Nijmegen breakage syndrome and ataxia-telangiectasia-like disorder (ATLD) patients, mutations in the NBS1 and the MRE11 genes lead to chromosomal instability accompanied by a profound defect in DNA damage induced G1 arrest of the cell cycle (G0/G1 checkpoint) while V(D)J recombination is spared (Carney et al., 1998; Varon et al., 1998). This is in sharp contrast to the normal G1 arrest in RS-SCID fibroblasts following irradiation (unpublished results).

Concerning the XRCC4/DNA-ligaseIV complex, two major differences exist between the RS-SCID condition and the XRCC4 and DNA-LigaseIV KO mice:

-   First, a complete null allele of Artemis (such as in P6, P15, and P40) does not lead to embryonic lethality in humans. This observation does not support an implication of Artemis in this phase of NHEJ.

-   Second, the rejoining of linearized DNA constructs introduced in RS-SCID fibroblasts is normal (unpublished results) while this assay, when defective, is highly diagnostic of abnormal NHEJ in yeast (Teo and Jackson, 1997; Wilson et al., 1997).

Perhaps the most evident link between Artemis and NHEJ is found in regard to the Ku/DNA-PK complex. Indeed, human RS-SCID patients and scid mice, which harbor a mutation in the DNA-PKcs encoding gene, are the only two known conditions where a V(D)J recombination associated DNA repair defect affects uniquely the formation of the coding joints.

The signal joint formation is also affected in the context of a defective V(D)J recombination in all the other analyzed settings.

Moreover, the manifestations of the DNA repair defect seem largely restricted to the immune system in both human RS-SCID patients and scid mice and do not seem to lead to neurologic disorders or development of cancer, for example, two manifestations that are often associated with defects in the other players of NHEJ (Roth and Gellert, 2000).

In conclusion, the invention describes the identification and cloning of the gene coding for Artemis, a novel actor of the V(D)J recombination. Mutations of Artemis cause T-B-SCID defects in humans owing to an absence of repair of the RAG1/2-mediated DNA double strand break. Artemis belongs to a large family of molecules that adopted the metallo-β-lactamase fold as part of their putative catalytic site. One branch of this family, which also includes murine SNM1 and yeast PSO2, has appended another domain, named βCASP, that may target the activity of this subgroup of proteins towards nucleic acids and thus serve the purpose of DNA repair.

Finally, other domains, yet to be defined, are probably responsible for directing the various Artemis/SNM1/PSO2 proteins to their specific DNA repair pathways: the dsb repair for Artemis or the DNA-ICL repair for SNM1/PSO2.

The following examples are intended for illustration purposes only, and should not be construed as limiting the scope of the invention in any way.

EXAMPLES

Example 1 Cloning of the Artemis cDNA

Patients and Cells

Thirteen RS-SCID patients from 11 families were analyzed in this study, having been selected for their typical phenotype of autosomal recessive SCID with complete absence of peripheral T and B lymphocytes, but presence of natural killer cells (Fischer et al., 1997). All the patients showed an impaired V(D)J recombination assay in fibroblasts, and for all RS-SCID families except P16, P40 and P47 radiosensitivity status could be determined on bone marrow cells and/or fibroblasts (Cavazzana-Calvo et al., 1993; Nicolas et al., 1998; Moshous et al., 2000; and our unpublished results). Genotyping of the consanguineous families using polymorphic microsatellite markers as reported elsewhere (Moshous et al., 2000) concurred in every case with our previously described localization of RS-SCID on chromosome 10p. The study was based on four patients of French origin (P1, P3, P6 and P15), one of whom (P15) was born to related parents, one of Italian origin (P16), one Greek (P4) and one African (P2). The remaining patients originate from four consanguineous Turkish families, two of them being related (P38, P5, P11 and P12 respectively). Informed consent was obtained from the families prior to this study. Primary fibroblast cell lines were derived from skin biopsies and pseudo-immortalized with SV40 as described elsewhere (Nicolas et al., 1998) and cultured in RPMI 1640 (GIBCO BRL) supplemented with 15% fetal calf serum.

Artemis cDNA Cloning and Genomic Amplification

First strand cDNA was synthesized from fibroblast RNA. Artemis full-length coding sequence was amplified by polymerase chain reaction (PCR) on cDNA using the Advantage-GC cDNA PCR Kit (Clontech) according to the manufacturer's recommendations and 0083F1 (5′-GATCGGCGGCGCTATGAGTT-3′, SEQ ID No 3) and 169F (5′-TGTCATCTCTGTGCAGGTTT-3′, SEQ ID No 4) primers designed from the AA278590 and AI859962 EST sequences. The PCR products were sequenced directly on an ABI377 sequencer (Perkin-Elmer) using BigDye Terminator Cycle Sequencing Ready Reaction (Applied Biosystems) with a series of internal oligonucleotides. Because of several alternatively spliced transcripts, sequencing was also performed on cloned PCR products. Artemis full-length cDNA was subcloned into pIRES-EGFP (Clontech) for subsequent use in transfection experiments. Genomic structure of the Artemis gene was deduced by alignment of the cDNA sequence to the draft sequence of the BAC 2K17 (AL360083). Series of oligonucleotide primer pairs were designed for the specific amplification of each exon. Exon 4 and exon 5 PCR products were cloned into pGemT (Promega) for subsequent use in Southern blot analysis (see below).
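As a quick sanity check on the two primers quoted above, their basic properties can be computed directly from the sequences. The sketch below is illustrative only: the helper names are hypothetical, and the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is an assumption of this example and not a design method stated in the text.

```python
# Primer sequences as quoted in the text (SEQ ID No 3 and SEQ ID No 4).
PRIMERS = {
    "0083F1": "GATCGGCGGCGCTATGAGTT",
    "169F": "TGTCATCTCTGTGCAGGTTT",
}

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_percent(seq):
    """Return the percentage of G and C bases in the sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C (Wallace rule, assumed here)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), "nt", gc_percent(seq), "% GC", wallace_tm(seq), "C")
```

Both primers are 20 nucleotides long; the rule-of-thumb estimate is only a coarse guide and does not replace the kit manufacturer's recommendations.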

The primers that were used for genomic amplification of the different exons 1 to 14 correspond to SEQ ID No 5 to SEQ ID No 32. These pairs of primers allow the amplification of each exon, as indicated in the sequence listing, the number of the exon being after “Ex”, “F” meaning Forward and “R” Reverse.

Results

As part of its effort in sequencing the human chromosome 10, the “Chromosome 10 Mapping Group” at the Sanger Center constructed several bacterial artificial chromosome (BAC) contigs covering the entire chromosome 10 (http://www.sanger.ac.uk/HGP/Chr10). Two of these contigs, 10ctg1105 and 10ctg23 (FIG. 1A), were of particular interest as they included several BACs bearing the RS-SCID region flanking markers D10S1664 and D10S674, as well as D10S191 and D10S1653, that showed the maximum pairwise LOD scores in A-SCID populations (Li et al., 1998).

A systematic survey of the nucleotide sequences covering the 24 BACs present in the two contigs, released by the “Human Chromosome 10 Sequencing Group” at the Sanger Center (http://webace.sanger.ac.uk/cgi-bin/ace/simple/10ace), was initiated. The analyses were based on in silico nucleotide sequence annotation with GENSCAN (Burge and Karlin, 1997) (http://bioweb.pasteur.fr/seqanal/interfaces/genscan.html) and FGENESH (Salamov and Solovyev, 2000) (http://genomic.sanger.ac.uk/gf/gfb.html), two programs aimed at searching for putative peptide-encoding genes in large genomic DNA sequences. All predicted peptides were subsequently run against the translated nonredundant GENBANK/EMBL nucleotide databases using TBLASTN. On the release of the AL360083 draft sequence, originating from the 2K17 BAC, FGENESH predicted a 149 amino-acid (aa) long peptide (subsequently named “0083” peptide) within which 121 aa turned out to be 35% and 31% identical to the MuSNM1 (AAF64472) and yeast PSO2 (P30620) proteins respectively. The same peptide prediction was obtained using GENSCAN.

SNM1 and PSO2 are two proteins involved in the repair of DNA damage caused by DNA interstrand cross-linking agents (ICL) including cisplatin, mitomycin C, or cyclophosphamide (Henriques and Moustacchi, 1980; Dronkert et al., 2000; and references therein). Mouse and yeast mutants for SNM1/PSO2 are hypersensitive to several agents that cause ICLs but not to γ-rays, in contrast to RS-SCID patients in whom hypersensitivity to γ-rays was found in both bone marrow cells and fibroblasts. The putative “0083” peptide encoding gene represented a good candidate owing to its chromosomal localization and the similarity of the “0083” peptide with DNA-repair proteins, despite the important discrepancy in the radiosensitivity phenotype. The validity of the putative “0083” peptide was established by searching for RNA transcripts encoding this peptide in the TIGR human Gene Indices database using TBLASTN (http://www.tigr.org/docs/tigrscripts/nhgi_scripts/tgi_blast.pl?organism=Human).

This query returned the THC535641 index within which the AA278590 represented the most 5′ sequence and corresponded to the I.M.A.G.E. clone 703546. The nucleotide sequence of this clone's opposite end (AA278850) matched the THC483503 index. The AI859962 nucleotide sequence in this second index provided the most 3′ extension of the cDNA coding for the “0083” peptide. A complete cDNA was obtained by RT-PCR amplification of fibroblast RNA using 0083F1 and 169F primers chosen within the AA278590 and AI859962 nucleotide sequences respectively. This cDNA was directly sequenced and cloned into pGemT. Several clones represented non-productive transcripts generated by alternative splicing events (data not shown). The inventors concentrated on one set of clones that harbored the longest open reading frame. The entire cDNA sequence of 2354 bp (SEQ ID No 1) contains an open reading frame (ORF) of 2079 bp. Either the ATG at position 39 or the ATG at position 60 is assumed to represent the translational initiation site, based on the analysis of surrounding nucleotides matching the Kozak consensus for translation initiation. The encoded protein of 692 or 685 aa, which was named Artemis (and corresponds to SNM1C), has a predicted molecular weight of about 78 kD. The “0083” peptide (corresponding to the peptide coded by nucleotides 348 to 800 of SEQ ID No 1) corresponds to I106-H254 (start at nucl. 39) and is part of a larger region that shows similarity to scPSO2 and muSNM1.
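The relationship between the ORF length, the two candidate start codons and the two protein lengths quoted above can be checked with a few lines of arithmetic. This is a minimal sketch under one stated assumption: the 2079 bp ORF is counted from the ATG at position 39 and includes the stop codon. The function name is hypothetical.

```python
CDNA_LENGTH = 2354   # bp, SEQ ID No 1
ORF_LENGTH = 2079    # bp, assumed to start at position 39 and include the stop codon
FIRST_ATG = 39       # 1-based cDNA position of the first candidate start codon
SECOND_ATG = 60      # 1-based cDNA position of the second candidate start codon

# Last nucleotide of the ORF (third base of the stop codon) under the assumption above.
ORF_END = FIRST_ATG + ORF_LENGTH - 1


def protein_length(start_pos):
    """Number of amino acids encoded from start_pos to the stop codon."""
    coding_nt = ORF_END - start_pos + 1
    if coding_nt % 3 != 0:
        raise ValueError("start codon is not in frame with the ORF")
    return coding_nt // 3 - 1  # the stop codon does not encode a residue


print(protein_length(FIRST_ATG))   # 692
print(protein_length(SECOND_ATG))  # 685
```

The two ATGs are 21 nucleotides apart, i.e. seven codons, which accounts for the difference between the 692 aa and 685 aa forms.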

It is not possible to fully ascertain that this cDNA represents the full-length sequence, although repeated attempts to further extend the 5′ sequence failed. The polyA tail in the 3′ untranslated region of the sequence SEQ ID No 1 is encoded by the genomic sequence and is not the consequence of RNA processing, suggesting that this cDNA may extend further downstream. Functional complementation studies (see below) strongly suggest, however, that the full Artemis ORF has indeed been cloned. The Artemis gene structure (Table 1) was deduced by comparison of the cDNA sequence to the genomic (AL360083) sequence. Artemis is composed of 14 exons with sizes ranging from 52 bp to 1160 bp. Four alternative exons were identified in various Artemis cDNA clones and result in non-productive splicing events (data not shown).

TABLE 1 Exon boundaries of the ARTEMIS gene

Exon      Length    Splice acceptor   Coding sequence        Splice donor
Exon 1    147 bp                      GCGGTT... ...ACAAAG    Gtgagt
Exon 2    52 bp     ttttag            ATCACA... ...GTGCAG    gtaatt
Exon 3    85 bp     tttcag            CTTGAA... ...CGAATT    gtaagt
Exon 4    60 bp     ttacag            ATATCT... ...GGAGAG    gtaact
Exon 5    56 bp     ttttag            AAGGAA... ...AGTTAT    gtaagg
Exon 6    102 bp    tttcag            GTTTTT... ...GGGCAG    gtactg
Exon 7    73 bp     tttcag            AGTCAA... ...AGTCGG    gtaagt
Exon 8    141 bp    gtctag            GAGGAG... ...GTCCAG    gtatgg
Exon 9    102 bp    ccttag            GTTCAT... ...CCCAAG    gtacgt
Exon 10   137 bp    ttttag            GCAGAG... ...TGTGAG    gtaaga
Exon 11   55 bp     ctttag            GACTGG... ...AGTGAG    gtaaga
Exon 12   89 bp     ttctag            GTGAGA... ...CGAAAT    gtgagt
Exon 13   95 bp     tttcag            ATTAAA... ...ACTCAG    gtaaga
Exon 14   1160 bp   ccacag            AGGAGG... ...AAAAAA
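The exon lengths in Table 1 can be cross-checked against the 2354 bp cDNA length reported for SEQ ID No 1, and converted into cDNA coordinates. The following sketch is illustrative; the function name is hypothetical and the coordinates assume that exon 1 starts at cDNA position 1.

```python
# Exon lengths in bp, exons 1 to 14, as listed in Table 1.
EXON_LENGTHS = [147, 52, 85, 60, 56, 102, 73, 141, 102, 137, 55, 89, 95, 1160]


def exon_coordinates(lengths):
    """Return a list of (start, end) cDNA positions, 1-based and inclusive."""
    coords = []
    position = 1
    for length in lengths:
        coords.append((position, position + length - 1))
        position += length
    return coords


COORDS = exon_coordinates(EXON_LENGTHS)

print("number of exons:", len(EXON_LENGTHS))
print("total length:", sum(EXON_LENGTHS), "bp")
print("smallest / largest exon:", min(EXON_LENGTHS), "/", max(EXON_LENGTHS), "bp")
for number, (start, end) in enumerate(COORDS, start=1):
    print("Exon", number, start, "-", end)
```

The exon lengths sum to 2354 bp, in agreement with the cDNA length given for SEQ ID No 1.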

The RS-SCID had previously been assigned to a 6.5 cM chromosomal region flanked by the D10S1664 and D10S674 polymorphic markers, a region too large for classical cDNA selection studies (Li et al., 1998; Moshous et al., 2000). In silico annotation of draft genomic sequences covering this region led the inventors to the identification of the Artemis gene.

Example 2 Expression of Artemis

Southern Blot Analysis

Ten μg of high molecular weight DNA were digested with HindIII or Eco88I, run on a 0.7% agarose gel, blotted onto a nylon membrane (Genescreen) under vacuum and hybridized with ³²P-labeled Artemis exon 5 and exon 4 specific probes.

RNA Expression Analysis

PCR-ready cDNAs from several tissues were purchased from Clontech and amplified with Artemis specific primers 0083F4 (5′-AGCCAAAGTATAAACCACTG-3′, SEQ ID No 33) and 169F (SEQ ID No 4) or manufacturer-provided GAPDH primers as control, and run on 1% agarose gels. Artemis specific PCR products were blotted onto a nylon membrane and hybridized with an internal ³²P-labeled oligonucleotide, whereas the GAPDH PCR control was revealed by ethidium bromide staining.

Results

Increased radiosensitivity of RS-SCID to γ rays is not restricted to the cells of the immune system but is also a characteristic of fibroblasts, suggesting that Artemis is ubiquitously expressed. This was confirmed by PCR analysis on a panel of 15 cDNAs representing a wide range of hematopoietic and non-hematopoietic tissues.

Artemis expression is ubiquitous but weak: 30 PCR cycles (38 cycles for skeletal muscle) were required to get an appropriate signal with an internal ³²P-labeled oligonucleotide, compared to the strong ethidium bromide staining obtained for the control gene GAPDH.

Low level expression of Artemis could reflect a general property of the SNM1 protein family, as the basal expression of mSNM1 in ES cells was found to be very low as well (Dronkert et al., 2000). Of note, compared to other tissues, Artemis expression was not increased in thymus or bone marrow, the sites of V(D)J recombination.

As expected, given the generalized increased radiosensitivity in RS-SCID patients' cells (Cavazzana-Calvo et al., 1993), Artemis demonstrated a pleiotropic expression pattern.

Example 3 The Artemis Gene is Mutated in Human RS-SCIDs

Mutation Analysis

Artemis mutation sequence analyses were performed either on cDNA following RT-PCR amplification or on genomic DNA following exon-specific PCR amplification. All PCR products were directly sequenced using BigDye Terminator Cycle Sequencing Ready Reaction.

Results

The structure and the sequence of the Artemis gene were analyzed in a series of 11 RS-SCID families including 13 patients (Table 2 and FIG. 1). Three patients (P6, P15, and P40) were characterized by a complete absence of the Artemis transcript caused by a genomic deletion extending from exon 1 to 4. This mutation can be considered as a complete null allele.

The same genomic deletion was present on one allele in P1, who carried a C279T nucleotide change on the other allele that led to the formation of a nonsense codon at R74. Homozygous C279T mutation was also present in P2 and found heterozygous in P4.

Two other genomic deletions were characteristic of this series of RS-SCID patients. A homozygous deletion spanning exons 5 to 8 in P47 led to the formation of a cDNA in which exon 4 was spliced in frame to exon 9, resulting in a putative protein lacking K96 to Q219. In P3, a heterozygous genomic deletion of exons 5 and 6 resulted in the out-of-frame splicing of exon 4 to 7, leading to a frameshift at K96. The second allele in this patient carried a heterozygous G to C nucleotide change in the exon 11 canonical splice donor sequence, which caused the out-of-frame splicing of exon 10 to 12, leading to a frameshift at T300.

Lastly, three other splice donor sequence mutations were identified in six patients. A heterozygous G to A nucleotide change in the exon 10 splice donor site in P4 gave rise to the production of a cDNA where the fusion of exon 9 to 12 preserved the open reading frame and potentially led to the production of a protein lacking A254 to E317 (Table 2). Homozygous G to T mutation in the exon 5 donor site was found in the siblings P5, P11, and P12 as well as in P38, creating the out-of-frame splicing of exon 4 to 6 and the formation of a frameshift at K96. Although this form of cDNA lacking exon 5 as a result of an alternative splicing event was also detected at low level in RNA from normal cells (data not shown), it accounted for all the cDNAs in P5 and P38. In patient P16, a homozygous deletion of G818 in exon 9, together with a homozygous C to T change nine nucleotides downstream in the intron, caused the formation of a frameshift at A254.

Whenever samples of the patients' parents were available, they were tested for the presence of the mutations. This confirmed the inheritance of the mutations in P5, P11, and P12 as well as P38 and P16 by direct sequencing of the exon-specific genomic PCR products obtained from parents' DNA (data not shown), which concurs with the autosomal recessive inheritance.

In summary, all of the 13 RS-SCID patients tested in this series carry homozygous or heterozygous mutations in the Artemis gene. None of these mutations were simple missense, and one of them (genomic deletion of exons 1 to 4) can be considered as a true null allele given the complete lack of Artemis transcript in P6, P15, and P40. All mutations are recapitulated in FIG. 1 and Table 2.

TABLE 2 Mutations of the Artemis gene in RS-SCID patients

Patients         Mutation                        Effect            Status
P1               Genomic deletion (Exons 1-4)    No RNA            Heteroz.
                 C279T                           R74X*             Heteroz.
P2               C279T                           R74X              Homoz.
P3               Genomic deletion (Exons 5-6)    K96 frameshift    Heteroz.
                 Exon-11 splice donor (G->C)     T300 frameshift   Heteroz.
P4               C279T                           R74X              Heteroz.
                 Exon-10 splice donor (G->A)     Del A254-E317     Heteroz.
P5/P11/P12/P38   Exon-5 splice donor (G->T)      K96 frameshift    Homoz.
P6/P15/P40       Genomic deletion (Exons 1-4)    No RNA            Homoz.
P16              del G818                        A254 frameshift   Homoz.
P47              Genomic deletion (Exons 5-8)    Del K96-Q219      Homoz.

*Stop codon
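The reading-frame consequences listed in Table 2 can be reproduced from the exon lengths of Table 1. The sketch below is a hedged illustration, not the inventors' analysis: the function names are hypothetical, and the residue numbering is assumed to run from the ATG at cDNA position 60, an inference made here because it reproduces R74, K96, Q219, A254 and E317.

```python
# Exon lengths in bp (exons 1 to 14), copied from Table 1.
EXON_LENGTHS = [147, 52, 85, 60, 56, 102, 73, 141, 102, 137, 55, 89, 95, 1160]

# Assumption of this sketch: residue numbering starts at the ATG at position 60.
NUMBERING_START = 60


def codon_number(cdna_pos, start=NUMBERING_START):
    """Residue number of the codon containing a 1-based cDNA position."""
    return (cdna_pos - start) // 3 + 1


def exon_span(first_exon, last_exon):
    """cDNA positions (start, end) covered by exons first_exon..last_exon."""
    start = sum(EXON_LENGTHS[:first_exon - 1]) + 1
    end = sum(EXON_LENGTHS[:last_exon])
    return start, end


def skip_consequence(first_exon, last_exon):
    """Return (in_frame, first_codon, last_codon) for skipped exons.

    The codons are those containing the first and last skipped nucleotide;
    the first altered residue may lie one codon further downstream.
    """
    start, end = exon_span(first_exon, last_exon)
    in_frame = (end - start + 1) % 3 == 0
    return in_frame, codon_number(start), codon_number(end)


print(codon_number(279))        # codon hit by C279T
print(skip_consequence(5, 8))   # genomic deletion of exons 5-8 (P47)
print(skip_consequence(5, 6))   # genomic deletion of exons 5-6 (P3)
print(skip_consequence(10, 11)) # exon 9 spliced to exon 12 (P4)
```

Under these assumptions, removal of exons 5 to 8 (372 bp) or of exons 10 and 11 (192 bp) keeps the reading frame, whereas removal of exons 5 and 6 (158 bp) or of exon 5 alone (56 bp) shifts it. For the exon 11 skip the sketch returns codon 299 while Table 2 lists T300; the first skipped nucleotide is the third base of codon 299, so the first altered residue may well be the following one.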

The first indication that Artemis was indeed the gene involved in the RS-SCID came from the identification of mutations in several patients. Altogether, 8 different alterations of the gene were found in 11 families. Although some of the mutations were recurrent, it was not possible to draw any clear correlation with the geographical origins of the patients.

Several interesting features arise from the analysis of these mutations:

-   Firstly, three of the identified modifications involve genomic deletions spanning several exons, leading to frameshift and appearance of a premature termination in two cases and an in-frame deletion of 216 aa in one case. This indicates that the Artemis gene may represent a hot spot for gene deletion.

-   Secondly, none of the mutations consists in simple nucleotide substitutions generating amino-acid changes, and only one, the C279T transition, creates a nonsense mutation. The other nucleotide changes affect splice donor sequences leading to either frameshifts in three cases or to in-frame deletion of part of the protein in one case.

-   Thirdly, in three patients (P6, P15, and P40) the genomic deletion comprises exon 1 to 4 and results in a complete absence of Artemis encoded cDNA. This deletion, which can be considered as resulting in a null allele, therefore demonstrates that Artemis is not an essential protein for viability, in contrast to XRCC4 and DNA-Ligase-IV for example (Barnes et al., 1998; Frank et al., 1998; Gao et al., 1998), or that it is partly redundant.

This information is of particular interest in the setting of a murine knockout counterpart to the human RS-SCID condition. The implication of Artemis in the RS-SCID condition was unequivocally established by complementation of the V(D)J recombination defect in patients' fibroblasts upon transfection of a wt Artemis cDNA (next example).

Example 4 Artemis Complements the RS-SCID V(D)J Recombination Defect

V(D)J Recombination Assay

The V(D)J recombination assay was performed as previously described (Nicolas et al., 1998). Briefly, 5×10⁶ exponentially growing SV40-transformed skin fibroblasts were electroporated in 400 μl of culture medium (RPMI1640, 10% FCS) with 6 μg of RAG-1 and 4.8 μg of RAG-2 encoding expression plasmids together with 2.5 μg of either pHRecCJ (coding joint) or pHRecSJ (signal joint) V(D)J extrachromosomal substrates. These plasmids carry a LacZ gene interrupted by a DNA stuffer flanked by V(D)J recombination signal sequences. Upon recombination, the stuffer DNA is excised and the LacZ gene reassembled, giving rise to blue bacterial colonies when plated onto Xgal/IPTG medium. pARTE-ires-EGFP (2.5 μg) was added for complementation analysis. Transfected constructs were recovered after 48 h, reintroduced into DH10B bacteria and plated on Xgal/IPTG containing plates. Percentage of recombination was determined by counting blue and white colonies and calculating the ratio. Blue colonies were randomly picked and the plasmid DNA sequenced to analyze the quality of the V(D)J junctions.

Results

The inventors previously demonstrated the absence of V(D)J recombination-derived coding joint formation in RS-SCID patients' fibroblasts upon transfection of RAG1 and RAG2 expression constructs together with extrachromosomal V(D)J recombination substrates specific for the analysis of the coding (pHRecCJ) joint (Nicolas et al., 1998). In contrast, signal joint formation was always found normal in RS-SCID fibroblasts (Nicolas et al., 1998; and Table 3).

The Artemis gene was cloned in the mammalian expression vector pIRES-EGFP, and its functional complementation activity was assessed in the V(D)J recombination assay in fibroblasts from 7 RS-SCID patients using the pHRecCJ substrate (Table 3). In all cases, bacterial blue colonies were recovered following transfections in the presence of wt Artemis, attesting to the RAG1/2 driven recombination of the substrate, while virtually no such colonies were obtained in the absence of exogenous wt Artemis.

The frequencies of recombination events ranged from 1.5×10⁻³ to 2.9×10⁻³, which agreed with the 3.2×10⁻³ frequency obtained when using a control fibroblast cell line. Sequence analysis of the recovered pHRecCJ plasmids forming blue colonies in this assay in fibroblast lines from P1 and P40 demonstrated that the junctions were bona fide V(D)J coding joints with limited trimming of the coding ends, similar to those obtained in control fibroblasts (not shown).

Altogether, these results indicate that the V(D)J recombination defect in RS-SCID is directly related to the described mutations in the Artemis gene and can be complemented by the introduction of a wt Artemis cDNA into the patients' fibroblasts. Although transient high-level expression of Artemis did not seem to be toxic, stable transfectants could not be derived to analyze the complementation of the hypersensitivity to ionizing radiation. This could be due to toxicity of long-term high-level expression of wt Artemis in the transfected fibroblasts. An analogous cellular toxicity was previously described upon overexpression of other human or murine homologs of SNM1 in vitro and may be a characteristic of this family of proteins (Dronkert et al., 2000). This is also in agreement with the physiological low-level RNA expression of these genes (see above).

It is interesting to note that the protein consisting of the first 385 amino acids of SEQ ID No 2 was also able to complement the V(D)J recombination defect in this assay.
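As a rough consistency check on the 385-residue fragment, the nucleotide ranges recited in the claims (39-2114, 60-2114, 39-1193 and 60-1193 of SEQ ID NO:1) can be converted to codon counts. The sketch below is illustrative only: the function name `codon_count` is hypothetical, and the calculation assumes that each range is read in frame from its first nucleotide and contains no stop codon, which is an assumption rather than something stated in the text.

```python
def codon_count(first_nt, last_nt):
    """Number of whole codons in an inclusive, 1-based nucleotide range.

    Assumption: the range is read in frame from first_nt and holds no stop codon.
    """
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("range is empty or not a whole number of codons")
    return length // 3


# Ranges taken from the claims; the in-frame reading is an assumption.
print(codon_count(39, 2114))  # 692
print(codon_count(39, 1193))  # 385
print(codon_count(60, 2114))  # 685
print(codon_count(60, 1193))  # 378
```

Under these assumptions, nucleotides 39-1193 correspond to 385 codons, in line with the 385-amino-acid fragment, and nucleotides 39-2114 correspond to 692 codons.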

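TABLE 3 Complementation of V(D)J coding-joint formation in RS-SCID fibroblasts transfected with wt Artemis

| Cell line | wt Artemis plasmid | Coding (pHRecCJ): Blue col. | Coding (pHRecCJ): Total | Coding (pHRecCJ): R* | Signal (pHRecSJ): Blue col. | Signal (pHRecSJ): Total | Signal (pHRecSJ): R# |
|---|---|---|---|---|---|---|---|
| Control | − | 50 | 42,000 | 3.5 | 27 | 10,840 | 2.5 |
| | + | 16 | 15,000 | 3.2 | | | |
| P1 | − | 0 | 34,800 | <0.03 | 34 | 45,600 | 0.7 |
| | + | 23 | 40,000 | 1.7 | | | |
| P4 | − | 2 | 36,300 | 0.16 | 336 | 98,200 | 3.4 |
| | + | 23 | 54,000 | 1.3 | | | |
| P5 | − | 0 | 21,400 | <0.05 | 17 | 3,000 | 5.7 |
| | + | 52 | 61,600 | 2.5 | | | |
| | + | 65 | 16,300 | 11.9 | | | |
| | + | 48 | 10,800 | 13.3 | | | |
| | + | 48 | 20,000 | 7.2 | | | |
| P15 | − | 2 | 11,160 | 0.53 | 192 | 30,000 | 6.4 |
| | + | 16 | 31,200 | 1.5 | | | |
| P16 | − | 0 | 13,000 | <0.08 | 51 | 33,200 | 1.5 |
| | + | 30 | 13,680 | 6.6 | | | |
| P40 | − | 0 | 40,250 | <0.02 | nd | nd | nd |
| | + | 90 | 160,000 | 1.7 | | | |
| | + | 65 | 58,800 | 3.3 | | | |
| | + | 100 | 42,840 | 7.0 | | | |
| | + | 76 | 29,736 | 7.6 | | | |
| P47 | − | 0 | 28,200 | <0.003 | 25 | 66,800 | 0.4 |
| | + | 48 | 50,000 | 2.9 | | | |

*R(coding joints) = 3 × (Blue col.)/(Total) × 1,000
#R(signal joints) = (Blue col.)/(Total) × 1,000
nd: not determined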
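The R values in Table 3 follow directly from the two footnote formulas. A minimal sketch of that arithmetic is given below, using colony counts copied from the table. The function name `recombination_frequency` and the `joint` argument are hypothetical names introduced for illustration, not part of the described protocol.

```python
# Multipliers from the Table 3 footnotes: coding-joint counts are
# multiplied by 3, signal-joint counts are not.
FACTORS = {"coding": 3, "signal": 1}


def recombination_frequency(blue, total, joint="coding"):
    """Return R (in units of 10^-3) as defined in the Table 3 footnotes."""
    if total <= 0:
        raise ValueError("total colony count must be positive")
    if joint not in FACTORS:
        raise ValueError("joint must be 'coding' or 'signal'")
    return FACTORS[joint] * blue / total * 1000


# Control line with wt Artemis, coding joints: 16 blue out of 15,000.
print(round(recombination_frequency(16, 15_000, "coding"), 1))  # 3.2
# Control line without wt Artemis, signal joints: 27 blue out of 10,840.
print(round(recombination_frequency(27, 10_840, "signal"), 1))  # 2.5
```

Rows with zero blue colonies are reported in the table as upper bounds (for example <0.03) rather than as 0; the sketch does not attempt to reproduce those bounds.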

Example 5 Artemis Belongs to the metallo-β-lactamase Superfamily

Database searches using the BLAST2 program with the Artemis aa sequence as query revealed significant similarities to several proteins, including the yeast PSO2 and murine SNM1 proteins, over the first 360 amino acids of Artemis. Subsequent iterations with the PSI-BLAST program highlighted significant similarities of the first 150 amino acids to well-established members of the metallo-β-lactamase superfamily.

The metallo-β-lactamase fold, first described for the Bacillus cereus β-lactamase (Carfi et al., 1995), is adopted by various metallo-enzymes with a widespread distribution and substrate specificity (Aravind, 1997). It consists of a four-layered β-sandwich with two mixed β-sheets flanked by α-helices, with the metal-binding sites located at one edge of the β-sandwich. Sequence analysis as well as secondary structure prediction for Artemis clearly indicated the conservation of motifs typical of the metallo-β-lactamase fold. This is true in particular for the amino acids D17, [HXHKDH]33-38 (SEQ ID NO: 35), H115, and D136, which participate in the metal-binding pocket and represent the catalytic site of the metallo-β-lactamases. The last metal-binding residue of the metallo-β-lactamases, H225 in Stenotrophomonas maltophilia metallo-β-lactamase (1SML), which is located at the end of a β-strand (strand β12), is absent in Artemis/SNM1/PSO2, but could be functionally substituted by the aspartic acid D165 of Artemis, also conserved in SNM1/PSO2. The latter residue is located at the end of a predicted β-strand, separated by an α-helix from the strand bearing the preceding metal-binding residue.

Example 6 In Vitro Mutagenesis of Artemis Gene

The analysis of the Artemis protein sequence revealed the existence of a putative metallo-β-lactamase domain (M1 to R179), which suggests that Artemis could have some catalytic function, such as hydrolase activity. This domain is followed by another domain (E180 to S385) that was called β-CASP, for β-Lactamase CPSF-Artemis-SNM1-PSO2 associated domain. The β-CASP domain is always associated with the β-lact domain in a series of proteins acting on the metabolism of nucleic acids (DNA repair, RNA processing, etc.). Finally, the last domain (C-ter) comprises E386 to T692. The role and the interdependence of these domains were analyzed in vitro by generating mutants and testing their activity both on V(D)J recombination and DNA repair.
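The domain boundaries given above can be captured in a small lookup table, which makes it easy to see which domains a given residue or truncated construct covers. This is only an illustrative sketch: the names `DOMAINS`, `domain_of` and `domains_in_construct` are hypothetical, and ASCII labels stand in for the βLact and βCASP names used in the text.

```python
# Domain boundaries as stated in the text (residue numbering of SEQ ID No 2).
# ASCII labels stand in for the betaLact / betaCASP domain names.
DOMAINS = [
    ("betaLact", 1, 179),
    ("betaCASP", 180, 385),
    ("C-ter", 386, 692),
]


def domain_of(residue):
    """Return the name of the domain containing a 1-based residue position."""
    for name, start, end in DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the range 1-692")


def domains_in_construct(first, last):
    """List the domains that a construct spanning first..last contains in full."""
    return [name for name, start, end in DOMAINS if first <= start and end <= last]


print(domain_of(17))                   # betaLact
print(domains_in_construct(1, 385))    # ['betaLact', 'betaCASP']
print(domains_in_construct(180, 692))  # ['betaCASP', 'C-ter']
```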

Results

V(D)J Recombination

The assay is based on the transfection of fibroblasts with Rag1 and Rag2 expression constructs to analyze the V(D)J recombination of an extrachromosomal substrate (pHRecCJ or pHRecSJ, see example 4). The βLact+βCASP region (M1 to S385) is sufficient to complement the V(D)J recombination defect in fibroblasts from Artemis-deficient (RS-SCID) patients.

However, the βLact domain (M1 to R179) alone does not complement this defect. There is no activity either when using the βCASP-Cter (E180 to T692) or C-ter (E386 to T692) configurations of Artemis.

The catalytic site of metallo-β-lactamases in bacteria is characterized by the arrangement of several conserved His and Asp residues which bind Zn atoms and confer the hydrolase activity. Many of these residues are also conserved in Artemis, which further suggests the possible hydrolase activity of Artemis.

Several of these His and Asp residues were mutated to Val and the residual function of the protein was analyzed in the V(D)J assay.

D17A, H35A, D37A, and D136A almost completely abolished the function of Artemis while H165A and H319A reduced its activity. H38A and H151A seem to have no effect.

DNA Repair

This assay is based on the analysis of the sensitivity to γ rays of RS-SCID fibroblasts transduced with a retroviral vector expressing various forms of Artemis. While βLact-βCASP/C-ter complements the hyper-radiosensitivity of RS-SCID cells, βLact-βCASP has no effect.

CONCLUSIONS

These mutagenesis experiments suggest that Artemis probably has some catalytic activity, as hypothesized from its homology with bacterial metallo-β-lactamases.

The βLact+βCASP domain of Artemis carries this catalytic activity since 1) it can ensure V(D)J recombination on extrachromosomal substrates and 2) mutation of several putative catalytic residues (His and Asp) abolishes the function.

Nevertheless, it is interesting to note that the βLact+βCASP domain does not seem to be sufficient by itself to allow for Artemis activity in the context of the whole chromosome, as the full-length Artemis protein, or at least part of the C-ter domain, seems to be required to complement the radiosensitivity phenotype.

This suggests that Artemis probably interacts with other proteins in the context of its substrate within chromatin.

Based on this result, it is possible that the V(D)J recombination activity on endogenous (in chromatin vs. extrachromosomal) Ig and TCR genes may also require the entire Artemis protein.

This situation is somewhat similar to that of the Rag2 protein. In Rag2, a core region is necessary and sufficient to drive recombination of extrachromosomal substrates while it is ineffective in the rearrangement of endogenous loci.

Example 7 Analysis of Patients with Partial Artemis Deficiency

Observation

In a survey of SCID patients, the inventors came across 4 cases (in 2 families) presenting a complex phenotype resembling ataxia telangiectasia, although without clear ataxia. These patients suffered from severe lymphopenia and hypogammaglobulinemia.

In some of these patients, the finding of chromosomal aberrations in lymphocytes suggested that they might present a DNA-repair defect, which was further confirmed by showing hypersensitivity of bone marrow to γ rays.

V(D)J recombination in fibroblasts of two of these patients (representing the two families) was either absent or severely diminished and was restored to normal level by addition of wt Artemis, as in example 4.

Lastly, it was possible to demonstrate that the Artemis gene was mutated in both cases, resulting in premature stop codons at T432 and D451 respectively (at the beginning of the C-ter domain).

This observation to some extent confirms the results of the in vitro mutagenesis (see example 6): it shows that Artemis protein devoid of the complete C-ter domain can still have recombination activity in the chromatin context in vivo (these patients do have some lymphocytes) but that its efficiency on endogenous loci is very weak (these patients are strongly lymphopenic).

In two patients of the first family, the immune deficiency was accompanied by the development of a very aggressive and disseminated lymphoproliferative syndrome (SLP) associated with the presence of EBV.

These SLPs can probably be regarded as true B cell lymphomas based on their clonality and their aggressiveness.

CONCLUSION

The main conclusion of this observation is that Artemis can probably be considered a “Caretaker”, since its defect is apparently associated with the development of B cell lymphomas.

This idea is supported by the literature, which shows (in several reports) that defects in other factors of V(D)J recombination/DNA repair (such as Ku80, DNA-PK, XRCC4, DNA-LigaseIV) in animal models always lead to the development of pro-B cell lymphomas when introduced on a P53−/− background, establishing them as true Caretakers.

This hypothesis can be tested in an animal model.

REFERENCES

-   Aravind, L. (1997). An evolutionary classification of the metallo-β-lactamase fold. In Silico Biology.
-   Barnes et al. (1998). Targeted disruption of the gene encoding DNA ligase IV leads to lethality in embryonic mice. Curr Biol 31, 1395-1398.
-   Biedermann et al. (1991). Scid mutation in mice confers hypersensitivity to ionizing radiation and a deficiency in DNA double-strand break repair. Proc. Natl. Acad. Sci. 88, 1394-1397.
-   Blunt et al. (1996). Identification of a nonsense mutation in the carboxyl-terminal region of DNA-dependent protein kinase catalytic subunit in the scid mouse. Proc. Natl. Acad. Sci. USA 93, 10285-10290.
-   Bosma et al. (1983). A severe combined immunodeficiency mutation in the mouse. Nature 301, 527-30.
-   Burge, C., and Karlin, S. (1997). Prediction of complete gene structures in human genomic DNA. J Mol Biol 268, 78-94.
-   Cameron et al. (1999). Crystal structure of human glyoxalase II and its complex with a glutathione thiolester substrate analogue. Structure Fold Des 7, 1067-78.
-   Carfi et al. (1995). The 3-D structure of a zinc metallo-beta-lactamase from Bacillus cereus reveals a new type of protein fold. Embo J 14, 4914-21.
-   Carney et al. (1998). The hMre11/hRad50 protein complex and Nijmegen breakage syndrome: linkage of double-strand break repair to the cellular DNA damage response. Cell 93, 477-86.
-   Cavazzana-Calvo et al. (1993). Increased radiosensitivity of granulocyte macrophage colony-forming units and skin fibroblasts in human autosomal recessive Severe Combined Immunodeficiency. J. Clin. Invest. 91, 1214-1218.
-   Chen et al. (2000). Response to RAG-mediated V(D)J cleavage by NBS1 and gamma-H2AX. Science 290, 1962-5.
-   Corneo et al. (2000). 3D clustering of human RAG2 gene mutations in Severe Combined Immune Deficiency (SCID). J. Biol. Chem. 275, 12672-12675.
-   Danska et al. (1996). Biochemical and genetic defects in the DNA-dependent protein kinase in murine scid lymphocytes. Mol Cell Biol 16, 5507-17.
-   Dronkert et al. (2000). Disruption of mouse SNM1 causes increased sensitivity to the DNA interstrand cross-linking agent mitomycin C. Mol Cell Biol 20, 4553-61.
-   Eastman et al. (1996). Initiation of V(D)J recombination in vitro obeying the 12/23 rule. Nature 380, 85-88.
-   Fischer et al. (1997). Naturally occurring primary deficiencies of the immune system. Annu. Rev. Immunol. 15, 93-124.
-   Frank et al. (1998). Late embryonic lethality and impaired V(D)J recombination in mice lacking DNA ligase IV. Nature 396, 173-7.
-   Fugmann et al. (2000). Identification of two catalytic residues in RAG1 that define a single active site within the RAG1/RAG2 protein complex. Mol Cell 5, 97-107.
-   Fulop, G. M., and Phillips, R. A. (1990). The scid mutation in mice causes a general defect in DNA repair. Nature 347, 479-482.
-   Gao et al. (1998). A targeted DNA-PKcs-null mutation reveals DNA-PK-independent functions for KU in V(D)J recombination. Immunity 9, 367-76.
-   Gao et al. (1998). A critical role for DNA end-joining proteins in both lymphogenesis and neurogenesis. Cell 95, 891-902.
-   Gottlieb, T. M., and Jackson, S. P. (1993). The DNA-dependent protein kinase: requirement for DNA ends and association with Ku antigen. Cell 72, 132-142.
-   Haber, J. E. (2000). Partners and pathways repairing a double-strand break. Trends Genet 16, 259-64.
-   Hendrickson et al. (1991). A link between double-strand break related repair and V(D)J recombination: the scid mutation. Proc. Natl. Acad. Sci. USA 88, 4061-4065.
-   Henriques, J. A., and Moustacchi, E. (1980). Isolation and characterization of pso mutants sensitive to photo-addition of psoralen derivatives in Saccharomyces cerevisiae. Genetics 95, 273-88.
-   Hu, D. C., Gahagan, S., Wara, D. W., Hayward, A., and Cowan, M. J. (1988). Congenital severe combined immunodeficiency disease (SCID) in American Indians. Pediatr. Res. 24, 239.
-   Jackson, S. P., and Jeggo, P. A. (1995). DNA double-strand break repair and V(D)J recombination: involvement of DNA-PK. Trends Biochem Sci 20, 412-5.
-   Jhappan, C., Morse, H. C. r., Fleischmann, R. D., Gottesman, M. M., and Merlino, G. (1997). DNA-PKcs: a T-cell tumour suppressor encoded at the mouse scid locus. Nat Genet 17, 483-6.
-   Junop, M. S., Modesti, M., Guarne, A., Ghirlando, R., Gellert, M., and Yang, W. (2000). Crystal structure of the xrcc4 DNA repair protein and implications for end joining. Embo J 19, 5962-70.
-   Kim, D. R., Dai, Y., Mundy, C. L., Yang, W., and Oettinger, M. A. (1999). Mutations of acidic residues in RAG1 define the active site of the V(D)J recombinase. Genes Dev 13, 3070-80.
-   Landree, M. A., Wibbenmeyer, J. A., and Roth, D. B. (1999). Mutational analysis of RAG1 and RAG2 identifies three catalytic amino acids in RAG1 critical for both cleavage steps of V(D)J recombination. Genes Dev 13, 3059-69.
-   Li et al. (1998). The gene for severe combined immunodeficiency disease in Athabascan-speaking Native Americans is located on chromosome 10p. Am J Hum Genet 62, 136-44.
-   Li et al. (1995). The Xrcc4 Gene Encodes a Novel Protein Involved In Dna Double-Strand Break Repair and V(D)J Recombination. Cell 83, 1079-1089.
-   McBlane et al. (1995). Cleavage at a V(D)J recombination signal requires only RAG1 and RAG2 proteins and occurs in two steps. Cell 83, 387-95.
-   Mombaerts et al. (1992). RAG-1 deficient mice have no mature B and T lymphocytes. Cell 68, 869-877.
-   Moshous et al. (2000). A new gene involved in DNA double-strand break repair and V(D)J recombination is located on human chromosome 10p. Hum Mol Genet 9, 583-588.
-   Nicolas et al. (1996). Lack of detectable defect in DNA double-strand break repair and DNA-dependent protein kinase activity in radiosensitive human severe combined immunodeficiency fibroblasts. Eur. J. Immunol. 26, 1118-1122.
-   Nicolas et al. (1998). A human SCID condition with increased sensitivity to ionizing radiations and impaired V(D)J rearrangements defines a new DNA Recombination/Repair deficiency. J. Exp. Med 188, 627-634.
-   Nussenzweig et al. (1996). Requirement for Ku80 in growth and immunoglobulin V(D)J recombination. Nature 382, 551-555.
-   Oettinger et al. (1990). RAG-1 and RAG-2, adjacent genes that synergistically activate V(D)J recombination. Science 248, 1517-1523.
-   Paull et al. (2000). A critical role for histone H2AX in recruitment of repair factors to nuclear foci after DNA damage. Curr Biol 10, 886-95.
-   Robins, P., and Lindahl, T. (1996). DNA ligase IV from HeLa cell nuclei. J Biol Chem 271, 24257-61.
-   Rogakou et al. (1999). Megabase chromatin domains involved in DNA double-strand breaks in vivo. J Cell Biol 146, 905-16.
-   Rogakou et al. (1998). DNA double-stranded breaks induce histone H2AX phosphorylation on serine 139. J Biol Chem 273, 5858-68.
-   Roth, D. B., and Craig, N. L. (1998). VDJ recombination: a transposase goes to work. Cell 94, 411-4.
-   Roth, D. B., and Gellert, M. (2000). New guardians of the genome. Nature 404, 823-5.
-   Roth et al. (1992). V(D)J recombination: Broken DNA molecules with covalently sealed (hairpin) coding ends in scid mouse thymocytes. Cell 70, 983-991.
-   Salamov, A. A., and Solovyev, V. V. (2000). Ab initio gene finding in Drosophila genomic DNA. Genome Res 10, 516-22.
-   Schatz et al. (1989). The V(D)J recombination activating gene, RAG-1. Cell 59, 1035-1048.
-   Schlissel et al. (1993). Double-strand signal sequence breaks in V(D)J recombination are blunt, 5′-phosphorylated, RAG-dependent, and cell cycle regulated. Genes Dev 7, 2520-32.
-   Schwarz et al. (1996). RAG mutations in human B cell-negative SCID. Science 274, 97-99.
-   Shin et al. (1997). A Kinase-Negative Mutation Of Dna-Pkcs In Equine Scid Results In Defective Coding and Signal Joint Formation. J. Immunol. 158, 3565-3569.
-   Shinkai et al. (1992). RAG-2 deficient mice lack mature lymphocytes owing to inability to initiate V(D)J rearrangement. Cell 68, 855-867.
-   Taccioli et al. (1998). Targeted disruption of the catalytic subunit of the DNA-PK gene in mice confers severe combined immunodeficiency and radiosensitivity. Immunity 9, 355-66.
-   Taccioli et al. (1993). Impairment of V(D)J recombination in double-strand break repair mutants. Science 260, 207-210.

1. An isolated nucleic acid molecule selected from the group consisting of: a) SEQ ID NO:1, nucleotides 39-2114 of SEQ ID NO:1, or nucleotides 60-2114 of SEQ ID NO:1 and b) an isolated and purified nucleic acid comprising the nucleic acid of a).

2. A vector comprising the nucleic acid molecule of claim 1.

3. An isolated host cell comprising the vector of claim 2.

4. A process of producing a protein involved in V(D)J recombination and/or DNA repair comprising the steps of: a) expressing the nucleic acid molecule of claim 1 in a suitable host to synthesize a protein involved in V(D)J recombination and/or DNA repair and b) isolating the protein involved in V(D)J recombination and/or DNA repair.

5. An isolated nucleic acid molecule that is the complement of the isolated nucleic acid molecule of claim 1.

6. An isolated protein or peptide coded by the nucleic acid of claim 1.

7. A monoclonal or polyclonal antibody that specifically recognizes the protein or peptide of claim 6.

8. A pharmaceutical composition comprising a pharmaceutically acceptable excipient with at least one of the nucleic acid of claim 1 or 5.

9. The isolated nucleic acid molecule of claim 1, wherein said isolated nucleic acid molecule comprises nucleotides 39-1193 or nucleotides 60-1193 of SEQ ID NO: 1.

10. A pharmaceutical composition comprising a pharmaceutically acceptable excipient with at least one of the vector of claim 2, the host cell of claim 3, the protein of claim 6, the antibody of claim 7, or a combination thereof.